How to Fine-Tune BERT for Text Classification?

Source: The Eighteenth China National Conference on Computational Linguistics and the 2019 Annual Academic Meeting of the Chinese Information Processing Society of China (CCL 2019)
Abstract
Language model pre-training has proven to be useful in learning universal language representations. As a state-of-the-art pre-trained language model, BERT (Bidirectional Encoder Representations from Transformers) has achieved amazing results in many language understanding tasks. In this paper, we conduct exhaustive experiments to investigate different fine-tuning methods of BERT on the text classification task and provide a general solution for BERT fine-tuning. Finally, the proposed solution obtains new state-of-the-art results on eight widely studied text classification datasets.
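Since only the abstract is reproduced here, the following is a minimal sketch of the kind of BERT fine-tuning pipeline for text classification that the abstract describes, written against the Hugging Face `transformers` library rather than the paper's own code. The checkpoint name (`bert-base-uncased`), the two-label sentiment setup, the toy batch, and the 2e-5 learning rate are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal sketch: fine-tuning BERT for text classification with Hugging Face
# `transformers`. Dataset, label count, and hyperparameters are assumptions
# for illustration, not the configuration reported in the paper.
import torch
from torch.optim import AdamW
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
# A classification head is added on top of the pre-trained encoder;
# num_labels=2 assumes a binary task such as sentiment classification.
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

# Toy batch; in practice these come from a text classification dataset.
texts = ["a gripping, beautifully shot film", "dull and far too long"]
labels = torch.tensor([1, 0])

batch = tokenizer(
    texts, padding=True, truncation=True, max_length=128, return_tensors="pt"
)

# A small learning rate is typical when fine-tuning all BERT parameters.
optimizer = AdamW(model.parameters(), lr=2e-5)

# One training step: forward pass computes cross-entropy loss over the
# pooled [CLS] representation, then backprop updates the whole model.
model.train()
optimizer.zero_grad()
outputs = model(**batch, labels=labels)
outputs.loss.backward()
optimizer.step()
```

In a full run, this step would loop over mini-batches for a few epochs with a held-out set for early stopping; the paper's contribution is precisely about which fine-tuning choices (beyond this baseline recipe) work best.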
Other Literature
The present study aims to investigate the colligational structures in China English. A corpus-based and comparative methodology was adopted in which three verbs of communication (discuss, communicate an…
Answer selection (AS) is an important subtask of question answering (QA) that aims to choose the most suitable answer from a list of candidate answers. Existing AS models usually explore the single-scale…
In recent years, machine reading comprehension has become an increasingly popular research topic. Promising results were obtained when the machine reading comprehension task had only two inputs, context…
Most current man-machine dialogues sit at the two end-points of a spectrum of dialogues, i.e. goal-driven dialogues and non-goal-driven chitchat. Document-driven dialogues provide a bridge between…
Natural language inference (NLI) is a challenging task of determining the relationship between a pair of sentences. Existing Neural Network-based (NN-based) models have achieved prominent success. However, rar…
In this paper, we present a neural model to map a structured table into document-scale descriptive texts. Most existing neural-network-based approaches encode a table record-by-record and generate long s…
Word embeddings have a significant impact on natural language processing. In morpheme writing systems, most Chinese word embeddings take a word as the basic unit, or directly use the internal structure…
Dropped pronoun recovery, which aims to detect the type of pronoun dropped before each token, plays a vital role in many applications such as Machine Translation and Information Extraction. Recently, deep…
Distant supervision for relation extraction has been widely used to construct training sets by aligning triples from a knowledge base, which is an efficient way to reduce human effort. However, th…
In relation extraction, directly applying a model trained on the source domain to the target domain causes a severe performance decrease. Existing studies extract the shared features between domains…