Next Generation Psychological Embeddings
Lead Research Organisation:
UNIVERSITY COLLEGE LONDON
Department Name: Experimental Psychology
Abstract
People have vast knowledge bases that allow them to represent relevant information about the world and act upon it. While there are specialist computer systems that best people in specific tasks, such as playing chess, humans are still the champions at being generalists. How do humans represent their rich knowledge so that they can appreciate similarities between objects, whether those similarities rest on superficial properties or deep connections, such as belonging to a shared biological category? It is a difficult question to answer that has both theoretical and practical ramifications. Understanding how people perceive the world is key to predicting and improving human behaviour. Likewise, building such representational or embedding spaces would provide a powerful tool for making AI systems more human-like.
Standard techniques for inferring psychological representations have been in wide use since the 1950s, but are limited in important ways. Standard techniques are data hungry and computationally slow. As a consequence, these techniques do not work well with real-world problems that often contain more than a million items. We aim to help Psychology transition to large-scale modelling, which we hope leads to a revolution like that experienced a decade ago in machine learning and AI when those fields moved to large-scale datasets. Another limitation of standard techniques is that they can't detect or take advantage of the relationships between items in the representational space. For example, if people know that two breeds of dogs are both dogs, even if they differ in size, they use that structural knowledge when making inferences. Our modelling approaches can discover and use such conceptual relationships. Likewise, we can capture how different groups, who vary in their life experiences, may represent the world in slightly different ways. In doing so, we can capture the uniqueness of each human experience rather than force a one-size-fits-all approach on the data, which would be a kind of tyranny of the majority for data science and Psychology. Finally, our methods can be adapted to take advantage of different approaches to measuring similarity.
Collectively, these limitations in standard approaches block the transfer of laboratory insights into real-world settings. While work has been done to address some of these limitations, no work has addressed all these limitations fully at once. We aim to do so at scale considering two databases of natural images (i.e., photographs) that each contain over a million images. Rather than offer an incremental advance, we aim to advance the state-of-the-art for representing spaces by more than an order of magnitude in size and improve the quality of the solution by capturing relations between images and groups of people as discussed above. We will make these resources and the tools publicly and freely available with guidance on how they can be extended to support others' work, whether it be in Psychology, Education, Human-Computer Interaction, AI, or other fields. Inferring psychological representations for these two large datasets will remove a long-standing hurdle in the research community, which should help machine learning and cognitive science researchers create better models of human cognition. This new framework and resource will make it possible to model differences between individuals, allowing us to better understand how different life experiences, such as measured by age, gender, and geographical location, impact how we think about the world.
People
| Name | Role | ORCID iD |
| Bradley Love | Principal Investigator | |
| Brett Roads | Researcher | |
Publications
Aho K
(2023)
Signatures of cross-modal alignment in children's early concepts.
in Proceedings of the National Academy of Sciences of the United States of America
Bröker F
(2024)
Demystifying unsupervised learning: how it helps and hurts.
in Trends in cognitive sciences
Dagaev N
(2023)
A too-good-to-be-true prior to reduce shortcut reliance
Dagaev N
(2023)
A too-good-to-be-true prior to reduce shortcut reliance.
in Pattern recognition letters
Lawrance A
(2024)
Shifting Perceptions: The Effects of Subordinate Level Training on Category Restructuring
in Journal of Vision
Love BC
(2023)
You can't play 20 questions with nature and win redux.
in The Behavioral and brain sciences
Luo X
(2025)
Coordinating multiple mental faculties during learning.
in Scientific reports
| Description | Perhaps not so much a key finding as setting the stage for one. We solved the fundamental computational challenges this grant proposal aimed to address. We are now applying this solution to collecting a massive embedding of images based on human judgments that will be useful for social scientists and computer scientists. |
| Exploitation Route | We have developed useful, professional-quality tools. Once our large datasets are collected, they will be made openly available for others to use. |
| Sectors | Creative Economy, Digital/Communication/Information Technologies (including Software), Other |
| Description | The impact will occur shortly when we publish our tool and share our code, along with the massive open datasets we are collecting. |
| First Year Of Impact | 2025 |
| Sector | Creative Economy, Digital/Communication/Information Technologies (including Software), Retail, Other |
| Impact Types | Cultural, Societal, Economic, Policy & public services |
| Title | ImageNet-Train HSJ |
| Description | The ImageNet Training Set Human Similarity Judgements dataset is a large-scale collection of human similarity judgments collected from thousands of participants. Participants viewed displays of nine images, where the center image was the anchor and the eight surrounding images were options. Participants were tasked with selecting the three option images that they considered most similar to the anchor image. When making their three choices, they also indicated the corresponding rank (i.e., 1st, 2nd, 3rd most similar). Images were drawn from the ImageNet training set, which comprises over 1.2 million unique images. To the best of our knowledge, this dataset will represent the largest publicly available dataset of its kind, being an order of magnitude larger than our previous effort, which involved 50,000 unique images. |
| Type Of Material | Database/Collection of data |
| Year Produced | 2025 |
| Provided To Others? | No |
| Impact | * This dataset enables the inference of psychological embeddings and analysis of human-perceived similarity at a state-of-the-art scale. |
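The trial format described above (one anchor, eight options, three ranked choices) can be sketched as a small record type. This is a minimal illustration, not the dataset's actual schema: the `RankedTrial` class, its field names, the image identifiers, and the `implied_triplets` helper are all hypothetical. The sketch also assumes one common reading of such a trial, namely that each selected option is judged more similar to the anchor than each unselected option, and that earlier-ranked choices are judged more similar than later-ranked ones.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class RankedTrial:
    """One 8-choose-3 ranked similarity trial (hypothetical record layout)."""

    anchor: str
    options: tuple  # the eight surrounding option images
    ranked: tuple   # the three selected options, most similar first

    def __post_init__(self):
        if len(self.options) != 8 or len(set(self.options)) != 8:
            raise ValueError("expected eight distinct option images")
        if len(self.ranked) != 3 or len(set(self.ranked)) != 3:
            raise ValueError("expected three distinct ranked choices")
        if not set(self.ranked) <= set(self.options):
            raise ValueError("every choice must be one of the options")


def implied_triplets(trial):
    """Return (anchor, closer, farther) triplets implied by one trial."""
    triplets = []
    # Assumption: an earlier-ranked choice is closer than a later-ranked one.
    for closer, farther in combinations(trial.ranked, 2):
        triplets.append((trial.anchor, closer, farther))
    # Assumption: every selected option is closer than every unselected one.
    unselected = [o for o in trial.options if o not in trial.ranked]
    for closer in trial.ranked:
        for farther in unselected:
            triplets.append((trial.anchor, closer, farther))
    return triplets


# Made-up image identifiers, for illustration only.
trial = RankedTrial(
    anchor="img_000",
    options=tuple(f"img_{i:03d}" for i in range(1, 9)),
    ranked=("img_004", "img_001", "img_007"),
)
triplets = implied_triplets(trial)
```

Under these assumptions a single trial yields 3 comparisons among the ranked choices plus 3 x 5 comparisons between selected and unselected options, which is one way to see why this display format gathers more information per trial than a simple two-option comparison.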
| Title | ImageNet-Val HSJ Hierarchical Variational Inference Model |
| Description | A hierarchical variational inference model combines the efficient scalability of variational inference with the efficient data-harvesting properties of hierarchical architectures. By leveraging hierarchical labels (such as those available for the ImageNet dataset), one can construct a multi-level hierarchical model that goes from most abstract to most concrete (e.g., thing -> animate object -> mammal -> German Shepherd -> german_shepherd_123.jpg). This architecture has the advantage of letting information propagate from data points of high certainty to data points of low certainty. For example, if the embedding coordinate of image german_shepherd_123.jpg is known with high certainty and the embedding coordinate of image german_shepherd_456.jpg is known with low certainty, the hierarchical model uses a prior to assume that the low-certainty image is located nearby since both images are members of the German Shepherd category. |
| Type Of Material | Computer model/algorithm |
| Year Produced | 2024 |
| Provided To Others? | No |
| Impact | When hierarchical labels are known, a hierarchical variational inference model enables researchers to infer higher quality psychological embeddings from the available human similarity judgments. |
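The way a category-level prior lets certainty propagate between images can be illustrated with a toy calculation. The sketch below is not the project's variational model; it swaps in a much simpler conjugate Gaussian update with a shared, known variance per coordinate. The function name `shrink_toward_category` and all numbers are made up for illustration.

```python
def shrink_toward_category(estimate, estimate_var, category_mean, category_var):
    """Combine an image's own estimate with its category prior.

    Toy Gaussian update: each source is weighted by its precision (1 / variance),
    so a noisy image estimate is pulled strongly toward the category mean while
    a well-determined estimate barely moves.
    """
    data_precision = 1.0 / estimate_var
    prior_precision = 1.0 / category_var
    total_precision = data_precision + prior_precision
    posterior_mean = [
        (data_precision * e + prior_precision * m) / total_precision
        for e, m in zip(estimate, category_mean)
    ]
    posterior_var = 1.0 / total_precision
    return posterior_mean, posterior_var


# Hypothetical 2-D category location for "German Shepherd".
category_mean = [2.0, -1.0]
category_var = 1.0

# A high-certainty image: small variance on its own estimate.
confident_mean, confident_var = shrink_toward_category(
    [2.1, -0.9], 0.01, category_mean, category_var
)

# A low-certainty image: large variance, so the prior dominates.
uncertain_mean, uncertain_var = shrink_toward_category(
    [5.0, 3.0], 4.0, category_mean, category_var
)
```

In this toy setting the low-certainty image ends up much closer to the category mean than its raw estimate, while the high-certainty image stays essentially where its own data placed it. The full model would also learn the category locations themselves from the data, which this sketch takes as given.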
| Title | Mephisto Fork: Prolific Patch |
| Description | Mephisto is open-source software produced by Meta that aims to make running online experiments easier. BDR has created a fork of the repository to address bugs and expand functionality so that the software can be used with the latest Prolific API. |
| Type Of Technology | Webtool/Application |
| Year Produced | 2025 |
| Open Source License? | Yes |
| Impact | * Enables Mephisto to be used with Prolific, since the existing implementation was incomplete and outdated. |
| URL | https://github.com/roads/Mephisto/tree/roads/prolific_api_v1 |
| Title | Mephisto Fork: Subunits package |
| Description | Mephisto is open-source software produced by Meta that aims to make running online experiments easier. BDR has created a fork of the repository to make it easy for users to create arbitrarily complex multi-trial sessions. |
| Type Of Technology | Webtool/Application |
| Year Produced | 2024 |
| Open Source License? | Yes |
| Impact | * The overhead for designing and deploying multi-trial experiments is significantly reduced. * The package provides a JSON schema for organizing collected experiment data. |
| URL | https://github.com/roads/Mephisto/tree/roads/subunits |
| Title | PsiZ Release 0.10.0 |
| Description | PsiZ (https://github.com/psiz-org/psiz) is an open-source python package that provides computational tools for modeling how people perceive the world. The primary use case of PsiZ is to infer psychological representations from human behavior (e.g., similarity judgments). The package integrates cognitive theory with modern computational methods. |
| Type Of Technology | Software |
| Year Produced | 2023 |
| Open Source License? | Yes |
| Impact | This release adds some new features, but primarily focuses on reorganization of the core functionality of PsiZ into easy-to-use percept, proximity, and behavior modules. |
| URL | https://github.com/psiz-org/psiz/releases/tag/v0.10.0 |
| Title | PsiZ Release 0.10.0 |
| Description | This release adds some new features, but primarily focuses on reorganization of the core functionality of PsiZ into easy-to-use percept, proximity, and behavior modules. * Source code related to architecting a "similarity function" has been reorganized to reflect the more general notion of a "proximity function" (which subsumes the notion of similarity, dissimilarity, kernel, and distance). * Some functions have been re-homed as "activation" layers. * RankSimilarity and RateSimilarity have been deprecated in favor of SoftRank and Logistic. In practice, this change means that users have increased responsibility for wiring up the layers, but model readability is now better. Examples, tutorials, and tests have all been updated to reflect these changes. To update code please see the examples. * Cell-based layers have been moved to experimental until a stable API can be determined. |
| Type Of Technology | Software |
| Year Produced | 2023 |
| Open Source License? | Yes |
| Impact | * Enables use of the latest TensorFlow python package. |
| URL | https://github.com/psiz-org/psiz/releases/tag/v0.10.0 |
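The split into percept, proximity, and behavior stages described in the release notes can be pictured with a plain-Python sketch. This is not PsiZ code and does not use the PsiZ API: every function name below is hypothetical, and the choice of a Minkowski distance, an exponential similarity kernel, and a sequential softmax over ranked choices is an assumption made for illustration.

```python
import math
from itertools import permutations


def minkowski_distance(x, y, rho=2.0):
    """Proximity stage, part 1: distance between two embedding coordinates."""
    return sum(abs(a - b) ** rho for a, b in zip(x, y)) ** (1.0 / rho)


def exponential_similarity(distance, beta=1.0):
    """Proximity stage, part 2: map a distance to a similarity in (0, 1]."""
    return math.exp(-beta * distance)


def ranked_choice_probability(anchor, options, ranked_indices):
    """Behavior stage: probability of selecting options in the given order.

    Each pick is a softmax-style choice among the options not yet selected,
    weighted by similarity to the anchor.
    """
    sims = [
        exponential_similarity(minkowski_distance(anchor, option))
        for option in options
    ]
    remaining = list(range(len(options)))
    probability = 1.0
    for idx in ranked_indices:
        probability *= sims[idx] / sum(sims[j] for j in remaining)
        remaining.remove(idx)
    return probability


# Percept stage stand-in: made-up 2-D embedding coordinates.
anchor = (0.0, 0.0)
options = [(0.0, 1.0), (3.0, 4.0), (0.0, 2.0)]

p_near_first = ranked_choice_probability(anchor, options, [0, 2])
p_far_first = ranked_choice_probability(anchor, options, [1, 2])

# Probabilities over every possible ordered pair of picks sum to one.
total = sum(
    ranked_choice_probability(anchor, options, list(order))
    for order in permutations(range(len(options)), 2)
)
```

Keeping the three stages as separate functions mirrors the trade-off the release notes mention: the caller has to wire the pieces together explicitly, but each stage can be read, tested, or swapped on its own.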
| Title | PsiZ Release 0.11.0 |
| Description | Bumps the TensorFlow and TensorFlow Probability requirements to the latest versions and updates the PsiZ codebase to be compatible with the latest versions. |
| Type Of Technology | Software |
| Year Produced | 2023 |
| Open Source License? | Yes |
| Impact | * Updates code to use latest version of TensorFlow and TensorFlow Probability. |
| URL | https://github.com/psiz-org/psiz/releases/tag/v0.11.0 |
| Title | PsiZ Release 0.12.2 |
| Description | * Update models and layers to use Keras 3 API (which is backend agnostic). * Update gate objects. * Remove deprecated classes and associated tests: |
| Type Of Technology | Software |
| Year Produced | 2024 |
| Open Source License? | Yes |
| Impact | * The PsiZ package now has more flexibility to use backends besides TensorFlow, enabling a wider range of users, such as those that prefer PyTorch. |
| URL | https://github.com/psiz-org/psiz/releases/tag/v0.12.2 |
| Title | PsiZ Release 0.8.0 |
| Description | PsiZ (https://github.com/psiz-org/psiz) is an open-source python package that provides computational tools for modeling how people perceive the world. The primary use case of PsiZ is to infer psychological representations from human behavior (e.g., similarity judgments). The package integrates cognitive theory with modern computational methods. |
| Type Of Technology | Software |
| Year Produced | 2023 |
| Open Source License? | Yes |
| Impact | This release refocuses the package on essential modeling components. Some modules and functionality have been deprecated or reworked since that functionality is better provided by third-party packages. This policy shift will smooth the road to a stable 1.0 release. |
| URL | https://github.com/psiz-org/psiz/releases/tag/v0.8.0 |
| Title | PsiZ Release 0.9.0 |
| Description | PsiZ (https://github.com/psiz-org/psiz) is an open-source python package that provides computational tools for modeling how people perceive the world. The primary use case of PsiZ is to infer psychological representations from human behavior (e.g., similarity judgments). The package integrates cognitive theory with modern computational methods. |
| Type Of Technology | Software |
| Year Produced | 2023 |
| Open Source License? | Yes |
| Impact | Update documentation and online tutorials to improve onboarding and usability. |
| URL | https://github.com/psiz-org/psiz/releases/tag/v0.9.0 |
| Description | Brad Love has given 19 invited talks citing this award since the start of the award. |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Professional Practitioners |
| Results and Impact | I have given departmental seminars and invited symposium talks in the UK and internationally. |
| Year(s) Of Engagement Activity | 2023 |
| Description | Bradley Love gave 12 invited talks related to this project in the last 12 months. |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Professional Practitioners |
| Results and Impact | These were mostly pure dissemination of research. |
| Year(s) Of Engagement Activity | 2024,2025 |
| Description | Guest lecture in Psyc 576E, University of Victoria, Canada |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Undergraduate students |
| Results and Impact | Presented an introductory lecture on psychological embeddings in an undergraduate/graduate seminar (Psyc 576E: The psychology of concepts, categories & decision making; taught by Prof. Jim Tanaka). The lecture served as a general introduction to the concepts and computational tools used for inferring psychological embeddings. Students in the course later used an open-source software package I created to infer psychological embeddings for their own research problems. The lecture gave me an opportunity to present my work and open-source tools to the next generation of scientists and grow my base of users. |
| Year(s) Of Engagement Activity | 2023 |
| Description | Neuro-AI-Talks (NEAT) 2023 |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Postgraduate students |
| Results and Impact | Attended the NEAT 2023 workshop in Osnabrück, Germany, and presented the poster: Roads, B. D. & Love, B. C. (2023). The Tradeoffs of Interpretable Dimensions in Psychological Embeddings. Poster session presented at: Neuro-AI Talks; 2023 September 24-25; Osnabrück. The poster was an opportunity to describe my current research to a new audience and solicit feedback. The presentation resulted in social networking and stimulating discussions. |
| Year(s) Of Engagement Activity | 2023 |
| URL | https://www.kietzmannlab.org/neat2023/ |
| Description | Podcasts and media coverage: https://touchneurology.com/podcast/braingpt-advancing-neuroscientific-research-with-ai/ , https://www.youtube.com/watch?v=Qgrl3JSWWDE , https://www.nature.com/articles/s41562-024-02046-9/metrics , https://www.youtube.com/watch?v=tNRMWNfrkxc , https://www.youtube.com/watch?v=EA44JEJPrc0 |
| Form Of Engagement Activity | A press release, press conference or response to a media enquiry/interview |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Public/other audiences |
| Results and Impact | Pure dissemination of research and value of research |
| Year(s) Of Engagement Activity | 2024,2025 |