A word's meaning resides in the heart and soul of its "generator": people. How do we ethically incorporate human (personal, social, cultural, situational) context into LLMs, the base models of our NLP systems?

Overview

Language modeling in the context of its source (the author) and its target (the audience) can enable NLP systems to better understand human language. Advances in human-centered NLP have established the importance of modeling the human context holistically, including personal, social, cultural, and situational factors. Yet our NLP systems have become heavily reliant on large language models that do not capture this human context.

Human language depends heavily on its rich and complex human context: (a) who is speaking, (b) to whom, (c) where (the situation or environment), and (d) when. It is further moderated by changing human states of being, such as mental and emotional states.

Given their scale of parameters and pre-training data, current large language models may implicitly simulate some form of human context. However, they do not explicitly process language within its higher-order structure: connecting documents to people, the "source" of the language.

Prior work has demonstrated the benefits of incorporating author information into LLMs for downstream NLP tasks. Recent research has also shown that LLMs can benefit from including additional author context in the LM pre-training task itself. Merging these two successful parallel threads, human-centered NLP and LLMs, drives us toward a vision of human-centered LLMs for the future of NLP.
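To make this direction concrete, the following is a minimal, hypothetical PyTorch sketch of one way author context could enter language modeling: a learned author embedding is prepended to the token sequence as a single soft-prompt position before standard next-token prediction. The architecture, names, and sizes here are illustrative assumptions for exposition only, not a method proposed by the workshop or any particular paper.

    import torch
    import torch.nn as nn

    class AuthorConditionedLM(nn.Module):
        """Toy causal LM that prepends a learned author embedding as a soft prompt.

        The author-ID lookup and all sizes are illustrative assumptions."""

        def __init__(self, vocab_size=1000, num_authors=100, d_model=64):
            super().__init__()
            self.tok_emb = nn.Embedding(vocab_size, d_model)
            self.author_emb = nn.Embedding(num_authors, d_model)  # one vector per author
            layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
            self.encoder = nn.TransformerEncoder(layer, num_layers=2)
            self.lm_head = nn.Linear(d_model, vocab_size)

        def forward(self, token_ids, author_ids):
            x = self.tok_emb(token_ids)                   # (batch, seq, d_model)
            a = self.author_emb(author_ids).unsqueeze(1)  # (batch, 1, d_model) soft prompt
            h = torch.cat([a, x], dim=1)                  # author slot + token sequence
            n = h.size(1)
            causal = torch.triu(torch.full((n, n), float("-inf")), diagonal=1)
            h = self.encoder(h, mask=causal)
            # Logits over the token positions; train with the usual shifted
            # next-token cross-entropy loss.
            return self.lm_head(h[:, 1:, :])

    model = AuthorConditionedLM()
    tokens = torch.randint(0, 1000, (2, 16))  # two documents, 16 tokens each
    authors = torch.tensor([3, 7])            # hypothetical author IDs
    logits = model(tokens, authors)           # (2, 16, 1000)

In practice, the author vector could instead be derived from a user's language history (e.g., averaged document embeddings) rather than an ID lookup; the soft-prompt framing is just one simple way to connect documents to their "source."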


Call for Papers

Human-centered large language modeling has the potential to bring promising improvements to human-centric applications across domains such as healthcare, education, and consumer services. At the same time, this new research focus raises a multitude of unexplored architectural, data, technical, fairness, and ethical challenges.

We invite submissions on topics that include, but are not limited to:

  • Human-centered LLM training/fine-tuning: Strategies to include the human context of the speaker and/or addressee, such as their personal factors, social context, etc.; Integrating group and/or individual human characteristics and traits; Human language modeling with multi-lingual LLMs or low-resource languages
  • Analysis and Applications: Evaluations for human language modeling that demonstrate personalized or socially contextual language understanding; Empirical findings with human language modeling, including failure cases with thorough analysis of negative results; Bias measurement and mitigation using human language modeling; Applications built on top of LLMs for real-world use or translational impact
  • Datasets: Obtaining data for training and evaluating human-contextualized LLMs
  • Position papers: Position papers on opportunities and challenges, including ethical risks

With our workshop, we aim to create a platform where researchers can present emerging challenges and solutions in building human-centered NLP models, integrating adaptation to human and social factors into the base LLMs of our NLP systems.


Archival Submissions

Authors are invited to submit long (8 pages) or short (4 pages) papers, with unlimited pages for references and appendices. Following ACL conference policy, authors of accepted papers will be given one additional page for the final, camera-ready version of their papers.

Please ensure that the submissions are formatted according to the ACL template style. You can access the template here.

Non-Archival Submissions

We welcome non-archival submissions through two tracks.

  • (1) Extended Abstracts: You can submit an extended abstract of work not published elsewhere, of 2-4 pages plus up to 2 pages for references. This can include position papers or early-stage work that would benefit from peer feedback. These submissions will be peer-reviewed double-blind, like the archival papers. Please use the OpenReview submission links below to submit a non-archival extended abstract.
  • (2) Published Papers: Work previously published, or accepted for publication elsewhere (e.g., ACL Findings), can also be submitted to the non-archival track, along with details of the venue or journal where it was accepted and a link to the archived version, if available. These papers will be reviewed single-blind, only for fit to the workshop theme, and have no page limit. We will release a form for submission of non-archival published papers soon.

Accepted papers in both non-archival tracks will be given an opportunity to present at the workshop, but will not be published in the ACL Anthology.


Important Dates

  • May 10 (Fri), 2024: Direct paper submission deadline (archival and non-archival extended abstract)
  • May 17 (Fri), 2024: ARR commitment deadline (Submission of already ARR-reviewed papers with the paper link)
  • June 17 (Mon), 2024: Notification of acceptance
  • June 25 (Tue), 2024: Non-Archival (published papers) submission deadline
  • July 1 (Mon), 2024: Camera-ready paper due
  • August 15 (Thu), 2024: Workshop date

All deadlines are 11:59 pm UTC-12 ("Anywhere on Earth").


Submission Links

Note: All authors must have an OpenReview profile. Please ensure profiles are complete before submission. As per OpenReview's moderation policy for newly created profiles:

  • New profiles created without an institutional email will go through a moderation process that can take up to two weeks.
  • New profiles created with an institutional email will be activated automatically.

If you have any questions, please contact us at: workshophucllm@googlegroups.com



Keynote Speakers

Daniel Hershcovich
University of Copenhagen, Denmark
Snigdha Chaturvedi
University of North Carolina at Chapel Hill, USA
Sebastian Ruder
Google, Germany

Panelists

Carolyn Rosé
Carnegie Mellon University, USA
Kayden Jordan
Harrisburg University of Science and Technology, USA
Debora Nozza
Bocconi University, Italy
Diyi Yang
Stanford University, USA
Sebastian Ruder
Google, Germany
Sara Hooker
Cohere AI, USA

Tentative Schedule

Time           Session
9:00 - 10:00   Keynote: Prof. Daniel Hershcovich (University of Copenhagen), "Cross-cultural Alignments in LLMs"
10:00 - 11:00  Poster session 1
11:00 - 12:00  Paper presentations 1
12:00 - 13:00  Lunch break
13:00 - 14:00  Keynote: Dr. Sebastian Ruder (Google), "Building Multilingual LLMs for User-centric Applications"
14:00 - 15:00  Paper presentations 2
15:30 - 16:30  Keynote: Prof. Snigdha Chaturvedi (UNC Chapel Hill), "Socially-aware NLP"
16:30 - 17:30  Panel discussion: Where are human-centered LLMs important? How can we achieve the vision of human-centered LLMs? Which ethical issues should we keep in mind when creating human-centered LLMs?

Organizers

Nikita Soni
Stony Brook University, USA
Lucie Flek
University of Bonn, Germany
Ashish Sharma
University of Washington, USA
Diyi Yang
Stanford University, USA
Sara Hooker
Cohere AI, USA
H Andrew Schwartz
Stony Brook University, USA

If you have any questions, please contact us at: workshophucllm@googlegroups.com


Program Committee

  • Amanda Curry, Bocconi University, Italy
  • Barbara Plank, LMU Munich, Germany
  • Cesa Salaam, Howard University, USA
  • Chia-Chien Hung, NEC Labs Europe, Germany
  • Dan Goldwasser, Purdue University, USA
  • Daniel Preotiuc-Pietro, Bloomberg, USA
  • Debora Nozza, Bocconi University, Italy
  • Francesco Barbieri, Snap Research, USA
  • Gavin Abercrombie, Heriot-Watt University, Scotland
  • Giuseppe Attanasio, Bocconi University, Italy
  • Harmanpreet Kaur, University of Michigan, USA
  • Hye Sun Yun, Northeastern University, USA
  • Ian Stewart, Pacific Northwest National Laboratory, USA
  • Inna Lin, University of Washington, USA
  • Jaemin Cho, University of North Carolina Chapel Hill, USA
  • Jielin Qiu, Carnegie Mellon University, USA
  • Joan Plepi, University of Marburg, Germany
  • Kokil Jaidka, National University of Singapore
  • Lucy Li, University of California, Berkeley, USA
  • Lucy Lu Wang, University of Washington, USA
  • Lyle Ungar, University of Pennsylvania, USA
  • Maarten Sap, Carnegie Mellon University, USA
  • Maria Antoniak, Allen Institute for AI, USA
  • Matthias Orlikowski, Bielefeld University, Germany
  • Meryem M'hamdi, University of Southern California, USA
  • Monica Munnangi, Northeastern University, USA
  • Salvatore Giorgi, University of Pennsylvania, USA
  • Sherry Tongshuang Wu, Carnegie Mellon University, USA
  • Shijia Liu, Northeastern University, USA
  • Shiran Dudy, Northeastern University, USA
  • Shreya Havaldar, University of Pennsylvania, USA
  • Silvio Amir, Northeastern University, USA
  • Tal August, Allen Institute for AI, USA
  • Vivek Kulkarni, University of California, Santa Barbara, USA
  • Zeerak Talat, Independent Researcher