Natural Fingerprints of Large Language Models

Suzuki, Teppei; Ri, Ryokan; Takase, Sho

arXiv:2504.14871 (cs)

[Submitted on 21 Apr 2025]

Abstract: Large language models (LLMs) often exhibit biases -- systematic deviations from expected norms -- in their outputs. These range from overt issues, such as unfair responses, to subtler patterns that can reveal which model produced them. We investigate the factors that give rise to identifiable characteristics in LLMs. Since LLMs model the distribution of their training data, it is reasonable to expect that differences in training data naturally lead to such characteristics. However, our findings reveal that even when LLMs are trained on exactly the same data, it is still possible to identify the source model from its generated text. We refer to these unintended, distinctive characteristics as natural fingerprints. By systematically controlling training conditions, we show that natural fingerprints can emerge from subtle differences in the training process, such as parameter sizes, optimization settings, and even random seeds. We believe that understanding natural fingerprints offers new insights into the origins of unintended bias and into ways to improve control over LLM behavior.

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2504.14871 [cs.CL]
	(or arXiv:2504.14871v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2504.14871

Submission history

From: Teppei Suzuki
[v1] Mon, 21 Apr 2025 05:48:52 UTC (665 KB)
