The Problem of Alignment

social-sciences

hci

prompt-engineering

Language models need alignment with human values to avoid reproducing biases. The resulting relationship between language and automation shapes linguistic theory and practice.

Authors

Tsvetelina Hristova

Liam Magee

Karen Soldatic

Published

December 30, 2023

Three Major Takeaways

Alignment in Large Language Models: Large language models (LLMs) such as ChatGPT have raised concerns about how the verbal behavior of autonomous machines can be coordinated with human interests, a concern known as the problem of alignment. The problem spans three questions: whether LLMs can reconstruct and comprehend human linguistic communication, how their outputs correspond with human expectations about their referents, and whether these outputs exhibit desirable moral agency.
Challenges of Alignment: The problem of alignment raises challenges of syntactic and pragmatic competency, of semantic competency, and of deontology in the outputs of LLMs. Alignment is framed as an overarching concern with whether structural rules and control can be uncovered in, or imposed on, the relationship between language and automation.
Structuralism and Statistical Probability: The historical and theoretical work of the Moscow Linguistic School and contemporaneous debates about statistical probability reveal an interplay between probability and structure that has shaped the understanding of language and computation. This interplay among structure, statistical probability, and communication influenced the development of mathematical linguistics and the quantification of linguistic use.

Critique

While the paper provides a comprehensive overview of the problem of alignment and its historical context, it could benefit from more specific examples and empirical evidence to support its claims. Additionally, the discussion of prompt engineering and experiments with ChatGPT could be further elaborated to provide a deeper understanding of the practical implications of alignment in LLMs. Further research and case studies could enhance the applicability of the paper’s findings to real-world scenarios.

Appendix

Model	gpt-3.5-turbo-1106
Date Generated	2024-02-26
Abstract	http://arxiv.org/abs/2401.00210v1
HTML	https://browse.arxiv.org/html/2401.00210v1
Truncated	True
Word Count	20502