Catalogue search • Linguistik portal • Fachinformationsdienst (FID)

1	SPT-Code: Sequence-to-Sequence Pre-Training for Learning Source Code Representations ...
	Niu, Changan; Li, Chuanyi; Ng, Vincent; Ge, Jidong; Huang, Liguo; Luo, Bin. - : arXiv, 2022
	Abstract: Recent years have seen the successful application of large pre-trained models to code representation learning, resulting in substantial improvements on many code-related downstream tasks. But there are issues surrounding their application to SE tasks. First, the majority of the pre-trained models focus on pre-training only the encoder of the Transformer. For generation tasks that are addressed using models with the encoder-decoder architecture, however, there is no reason why the decoder should be left out during pre-training. Second, many existing pre-trained models, including state-of-the-art models such as T5-learning, simply reuse the pre-training tasks designed for natural languages. Moreover, to learn the natural language description of source code needed eventually for code-related tasks such as code summarization, existing pre-training tasks require a bilingual corpus composed of source code and the associated natural language description, which severely limits the amount of data for pre-training. To ...
	Comment: Accepted by ICSE 2022, not the final version ...
	Keyword: FOS Computer and information sciences; Software Engineering cs.SE
	URL: https://arxiv.org/abs/2201.01549 https://dx.doi.org/10.48550/arxiv.2201.01549
	BASE