PromptCards Project

The rise of generative language models has revolutionized the field of natural language processing. These models open up new possibilities for zero-shot data annotation and classification (Huang et al. 2023; Wu et al. 2023; Törnberg 2023; Gilardi et al. 2023). However, their performance depends heavily on the prompts and hyperparameters used for annotation (Reiss 2023). For reliability and reproducibility, it is therefore important to share and validate the prompts and all related parameters used for annotation.

PromptCards for Annotations (& zero-shot classification) is a simple approach to enhance the performance of generative language models, such as ChatGPT, FLAN-T5, and Dolly, in the context of artificial annotation. This project provides a standardized method for using, sharing, and validating prompts for various tasks and datasets in research.

The prompts are structured in a template of prompt cards. Each card provides detailed information about the author, date, language, NLP task, prompt text, limitations, and research paper, along with further details.
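To make the card structure more concrete, here is a minimal Python sketch of how the fields listed above could be represented and serialized. It is an illustration only, not the project's actual template: the class name `PromptCard`, the field names, the `{text}` placeholder convention, the `extra` dictionary for hyperparameters, and all example values are assumptions.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class PromptCard:
    """Hypothetical prompt card; field names mirror the metadata listed above."""

    author: str
    date: str
    language: str
    nlp_task: str
    prompt_text: str
    limitations: str = ""
    research_paper: str = ""
    # Assumed catch-all for additional details such as model hyperparameters.
    extra: dict = field(default_factory=dict)

    def render(self, text: str) -> str:
        """Fill the assumed {text} placeholder with the text to annotate."""
        return self.prompt_text.replace("{text}", text)

    def to_json(self) -> str:
        """Serialize the card so it can be shared and archived."""
        return json.dumps(asdict(self), indent=2, ensure_ascii=False)


# Example values are invented for illustration.
card = PromptCard(
    author="Jane Doe",
    date="2023-05-01",
    language="en",
    nlp_task="sentiment classification",
    prompt_text=(
        "Classify the sentiment of the following text as positive or negative.\n"
        "Text: {text}\n"
        "Label:"
    ),
    limitations="Illustrative only; not validated on any dataset.",
    extra={"temperature": 0.0},
)

print(card.render("I really enjoyed this talk."))
print(card.to_json())
```

Storing the hyperparameters next to the prompt text keeps everything needed to reproduce an annotation run in one shareable record.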

The project also provides an online archive that is constantly growing as the community contributes new prompts. If you have a prompt that you would like to share, please consider uploading ☁️⬆️ it to the archive.

Literature

  1. Huang et al. (2023) "Is ChatGPT better than Human Annotators? Potential and Limitations of ChatGPT in Explaining Implicit Hate Speech"
  2. Wu et al. (2023) "Large Language Models Can Be Used to Estimate the Ideologies of Politicians in a Zero-Shot Learning Setting"
  3. Törnberg (2023) "ChatGPT-4 Outperforms Experts and Crowd Workers in Annotating Political Twitter Messages with Zero-Shot Learning"
  4. Gilardi et al. (2023) "ChatGPT Outperforms Crowd-Workers for Text-Annotation Tasks"
  5. Reiss (2023) "Testing the Reliability of ChatGPT for Text Annotation and Classification: A Cautionary Remark"
  6. Kuzman et al. (2023) "ChatGPT: Beginning of an End of Manual Linguistic Data Annotation? Use Case of Automatic Genre Identification"