The English language is filled with trait words like “caring” and “smart”
These words are the currency of personality/social psych, yet key questions remain about their evolution, function, and structure
We take on these questions in a preprint led by
@yuanzeliu.bsky.social
osf.io/preprints/ps...
Jul 22, 2025 15:35
The main goal of our paper was to create a new theoretical framework for understanding trait language (expanded later)
To do so, we decided to create a large and high-quality list of English trait words, something that psychologists have tried to do since the classic work of Allport and Galton
To create our list, we...
a) Fine-tuned and validated BERT models to be high-quality "trait detectors"
b) Used these models to annotate the entire vocabulary of Google Books
c) Checked these annotations using another LLM (GPT)
This process gave us 2847 trait words
We host this trait list on our OSF page (
osf.io/7k4yj/files/...), freely available to all
All traits are linked to:
-A definition
-Probability of being a trait word
-Loading on 24 dimensions (e.g., friendliness, openness)
-Date of origin
-Number of meanings
One observation from building this list is that being a “trait word” is not black and white
Traits are natural categories with fluid boundaries. Some words are almost exclusively used as traits, but many have multiple meanings
Our approach gives words a continuous probability of being traits
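To make the "continuous probability" idea concrete, here is a minimal sketch of how a cutoff changes which words count as traits. The words, the scores, and the `select_traits` helper are all made up for illustration; they are not taken from the trait list or the paper's code.

```python
# Hypothetical trait probabilities, invented for this example only.
trait_probability = {
    "friendly": 0.98,
    "stubborn": 0.95,
    "sharp": 0.60,   # also describes knives
    "green": 0.20,   # mostly a colour, occasionally "inexperienced"
    "table": 0.01,
}


def select_traits(scores, threshold):
    """Return words whose probability meets the threshold, highest first."""
    kept = [word for word, p in scores.items() if p >= threshold]
    return sorted(kept, key=lambda word: scores[word], reverse=True)


# A strict cutoff keeps only near-unambiguous trait words;
# a lenient cutoff also admits words with other common meanings.
strict = select_traits(trait_probability, 0.9)
lenient = select_traits(trait_probability, 0.5)
print(strict)
print(lenient)
```

Because the score is continuous, anyone using the list can pick a cutoff that fits their question instead of inheriting a single yes/no decision.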
After collecting the words, we turned to their semantic structure
We took 24 dimensions from a review of personality and psych science lit. Then Prolific workers rated all words on these dimensions
This plot shows how some words are generalized (relevant to many dimensions); others are specific
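One simple way to think about generalized vs. specific words is to count how many dimensions a word is clearly relevant to. The sketch below assumes ratings on a -1 to 1 scale and a relevance cutoff of 0.5; the ratings, the cutoff, and the `generality` function are illustrative assumptions, and the paper may measure this differently.

```python
def generality(ratings, cutoff=0.5):
    """Count the dimensions on which a word's absolute rating exceeds the cutoff."""
    return sum(1 for value in ratings.values() if abs(value) > cutoff)


# Hypothetical ratings on four of the dimensions (invented numbers).
ratings = {
    "good": {
        "friendliness": 0.9,
        "morality": 0.9,
        "competence": 0.7,
        "attractiveness": 0.6,
    },
    "punctual": {
        "friendliness": 0.1,
        "morality": 0.2,
        "competence": 0.8,
        "attractiveness": 0.0,
    },
}

scores = {word: generality(r) for word, r in ratings.items()}
print(scores)
```

In this toy data, "good" is relevant to every dimension while "punctual" is relevant to only one.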
In an EFA, these 24 dimensions reduced to a four-dimension FACT model
Fitness (e.g., Attractiveness)
Agency (e.g., Assertiveness)
Communion (e.g., Morality)
Traditionalism (e.g., Conservatism)
Our model contains parts of prior models, most notably the ABC model of social evaluation
Although this FACT model is interesting and parsimonious, it masks variation within factors, and links between factors
The network below shows all words connected by their overlapping semantics.
The FACT factors partly organize the network, but lots of complexity remains
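For readers who want the intuition behind a network like this: a rough sketch is to treat each word as a vector of dimension ratings and link two words when their vectors are similar enough. The cosine measure, the 0.8 threshold, and the three toy vectors are assumptions for illustration, not the paper's actual procedure or data.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_edges(vectors, threshold=0.8):
    """Link every pair of words whose similarity meets the threshold."""
    edges = []
    for a, b in combinations(sorted(vectors), 2):
        if cosine(vectors[a], vectors[b]) >= threshold:
            edges.append((a, b))
    return edges


# Hypothetical ratings on (friendliness, morality, assertiveness).
vectors = {
    "kind": [0.9, 0.8, 0.1],
    "honest": [0.7, 0.9, 0.2],
    "bold": [0.1, 0.0, 0.9],
}

edges = build_edges(vectors)
print(edges)
```

Here the two communal words end up connected and "bold" stays unlinked, which is the kind of clustering the network plot visualizes at scale.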
For example, we find that Communion words are highly clustered (e.g., words expressing friendliness tend to also express morality)
But words about traditionalism are spread across the network, closely knit with communion and agency
Most trait dimensions are evaluative: they have positively and negatively valenced poles (it is good to be friendly and bad to be unfriendly)
But some dimensions are not. Communion and agency are more evaluative than traditionalism or fitness dimensions
Are there more negative or positive trait words? I once assumed there were more negative words because of negativity dominance
The real story is complicated
Communion dimensions are indeed filled with negative words.
But we have more positive words about agency, traditionalism, and fitness
Negative and positive words also vary in their specificity.
Negative words are more likely to be specific, whereas positive words are more likely to generalize across many dimensions
Finally, we look at the history of trait words using dates of origin from the Oxford English Dictionary
This plot shocked me – our trait vocabulary has been remarkably stable over time
We have always had the most words about communion and the fewest words about fitness
The small change we do see is a surge of “traditionalism” words around the rise of participatory democracies in Europe
“Conservatism-liberalism” is the “youngest” trait dimension by date of introduction
We also see interesting trends in the semantic coherence of these factors
We calculate coherence by looking at clustering in semantic space over time from word embeddings
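As a rough illustration of what "coherence" can mean here, one common choice is the average pairwise cosine similarity among a factor's words, computed separately for each period's embeddings. This is a sketch under that assumption; the paper's exact metric may differ, and the 2-D vectors below are invented for the example.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def coherence(embeddings):
    """Mean pairwise cosine similarity among the words of one factor."""
    pairs = list(combinations(embeddings.values(), 2))
    if not pairs:
        raise ValueError("need at least two words to compute coherence")
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


# Hypothetical embeddings for the same three words in two periods.
earlier = {"w1": [1.0, 0.0], "w2": [0.0, 1.0], "w3": [1.0, 1.0]}
later = {"w1": [1.0, 0.2], "w2": [0.8, 0.4], "w3": [1.0, 0.3]}

print(coherence(earlier), coherence(later))
```

A rising score across periods would correspond to words converging in semantic space, and a falling score to words diverging.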
Here are clusters of words projected based on their 19th and 20th century embeddings
We find that communion words are converging in semantic space—extending work on the collapse of morality into a single dimension (
cdr.lib.unc.edu/downloads/g7...)
But words about agency/competence are diverging in semantic space. They are becoming less related to each other over time
The general discussion of our paper reviews these findings using a new “functional constructionist” approach to trait words, building on past work in emotion
Functional ➡️ trait words are useful
Constructionist ➡️ There is no single trait space
Our approach makes several simple assumptions that encompass our different findings
1. Trait words reflect the nature of social dilemmas
2. We use trait words as diagnostic tools for prediction in these dilemmas
3. When social dilemmas change, we also change our trait language
- Trait words reflect the nature of social dilemmas -
Example: We view three of the FACT dimensions as mapping onto classic social dilemmas identified in game theory
-Communion ➡️ cooperation
-Traditionalism ➡️ coordination
-Agency ➡️ whether people can cooperate and coordinate
- We use trait words as diagnostic tools for prediction -
Example: We may have many negative communal words because cooperation is common, so it is more diagnostic to communicate about deviations from cooperation
But since skill is rare in most domains, we have many positive words about agency
- When social dilemmas change, trait language changes -
Example: The rising division of labor over the last 200 years has meant that people are selected on specific competences
This might be one reason why agency language has become more semantically heterogeneous
Our general discussion covers this framework in great depth, and uses it to make tentative inferences about all of our findings
This is a working paper, so any feedback would be helpful!
I also want to recognize the authors once more:
@yuanzeliu.bsky.social completed this massive project as a pre-doc, which speaks to his promise as a scholar
Alex Koch contributed endless theoretical insights over two years
Tessa Charlesworth gave some of the most useful and thorough comments I have ever read
And
@andyluttrell.bsky.social pushed this paper to the point it is now