Like Father, Like Son: Kinship-Aware Preference Mapping (KARMA) for Automatic Alignment in Large Language Models

Jung, Jeesu; Park, Chanjun; Jung, Sangkeun

arXiv:2502.18744 (cs)

[Submitted on 26 Feb 2025]

Abstract: Recent advancements in Large Language Model (LLM) alignment have sought to mitigate the cost of human annotations by leveraging pretrained models to generate preference data. However, existing methods often compare responses from models with substantially different capabilities, yielding superficial distinctions that fail to provide meaningful guidance on what constitutes a superior response. To address this limitation, we propose Kinship-Aware pReference MApping (KARMA), a novel framework that systematically pairs responses from models with comparable competencies. By constraining preference comparisons to outputs of similar complexity and quality, KARMA enhances the informativeness of preference data and improves the granularity of alignment signals. Empirical evaluations demonstrate that our kinship-aware approach leads to more consistent and interpretable alignment outcomes, ultimately facilitating a more principled and reliable pathway for aligning LLM behavior with human preferences.

Comments:	14 pages, 5 figures, 3 tables, 4 graphs
Subjects:	Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Cite as:	arXiv:2502.18744 [cs.AI]
	(or arXiv:2502.18744v1 [cs.AI] for this version)
	https://doi.org/10.48550/arXiv.2502.18744

Submission history

From: Jeesu Jung
[v1] Wed, 26 Feb 2025 01:36:40 UTC (1,513 KB)
