
Natural Language Processing (12)

[Paper Review] Prompting GPT-3 To Be Reliable
"Large language models (LLMs) show impressive abilities via few-shot prompting. Commercialized APIs such as OpenAI GPT-3 further increase their use in real-world language applications. However, the..."
https://arxiv.org/abs/2210.09150
Contents: Abstract · 1 Introduction · 2 Facet 1: Generalizability (Experiment Setup, Result, Takeaway) · 3 Facet 2: Social Bias and Fairness · 3.1 The Case of Gender..
2023. 9. 7.
[Paper Review] On Second Thought, Let's Not Think Step by Step! Bias and Toxicity in Zero-Shot Reasoning
"Generating a Chain of Thought (CoT) has been shown to consistently improve large language model (LLM) performance on a wide range of NLP tasks. However, prior work has mainly focused on logical..."
https://arxiv.org/abs/2212.08061
Contents: Paper attribution · Abstract · 1. Introduction · 2. Related Work · 3. Stereotype & Toxicity Benchmarks · 3.1 Stereotype Benchm..
2023. 9. 7.
[Paper Review] Legal Prompting: Teaching a Language Model to Think Like a Lawyer
"Large language models that are capable of zero or few-shot prompting approaches have given rise to the new research area of prompt engineering. Recent advances showed that for example..."
https://arxiv.org/abs/2212.01326
Contents: Paper attribution · Abstract · 1. Introduction · 2 Legal Entailment Task · 3 Prior Work · 4 Experiments and Results · 4.1 Zero-shot (Z..
2023. 9. 7.
[Paper Review] FairLex: A Multilingual Benchmark for Evaluating Fairness in Legal Text Processing
Contents: Abstract · 1. Introduction (Contribution) · 2. Related Work · 3. Benchmark Datasets (ECtHR: The European Court of Human Rights, SCOTUS: The US Supreme Court, FSCS: The Federal Supreme Court of Switzerland, CAIL: The Supreme People's Court of China) · 4. Fine-tuning Algorithms (ERM, Group DRO, V-REx, IRM, Adversarial Removal) · 5. Experimental Setup (Models) · 6. Result (Group Disparity Analysis, Cross-Attribute Influence Analysis, Group Robust ..)
2023. 9. 4.