Publication at the scientific conference ICCSA’2023


We are pleased to announce that our paper entitled "An empirical comparison of Transformer-based models in Vulnerability Prediction" has been accepted at the International Conference on Computational Science and Its Applications (ICCSA 2023).

A version of the paper can be found at the link below:

https://lnkd.in/dm36t24X

In this paper, we examine the ability of various Transformer-based models to predict the existence of software vulnerabilities. The rapid rise of language models provides a new direction for tackling downstream tasks such as text classification. Vulnerability prediction is a problem that has long been associated with text mining techniques and can therefore benefit from pre-trained natural language processing (NLP) models. In particular, we empirically tested a multitude of pre-trained NLP models on the downstream task of text mining-based vulnerability prediction, highlighting potential differences in their performance and thereby determining the optimal choice among them. To this end, we fine-tuned several large pre-trained models on a dataset with vulnerability-related labels. We evaluated BERT, GPT-2, BART, and several BERT variants. The findings show that CodeBERT, which is pre-trained not only on natural language but also on pairs of natural language (NL) and programming language (PL), proved to be the superior model in our analysis. It is also worth noting that BERT achieves performance close to that of CodeBERT, even though it is not pre-trained on programming languages.
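
To make the setup concrete, below is a minimal sketch of how a pre-trained Transformer such as CodeBERT can be fine-tuned for binary vulnerability prediction as text classification, using the Hugging Face Transformers library. The model checkpoint name is real, but the toy dataset, hyperparameters, and training loop are illustrative assumptions only and do not reproduce the exact experimental protocol of the paper.

import torch
from torch.utils.data import DataLoader, Dataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "microsoft/codebert-base"  # e.g. swap in "bert-base-uncased" to compare

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)

class VulnDataset(Dataset):
    """Wraps (text, label) pairs; label 1 = vulnerable, 0 = not vulnerable."""
    def __init__(self, texts, labels):
        self.enc = tokenizer(texts, truncation=True, padding="max_length",
                             max_length=256, return_tensors="pt")
        self.labels = torch.tensor(labels)
    def __len__(self):
        return len(self.labels)
    def __getitem__(self, i):
        item = {k: v[i] for k, v in self.enc.items()}
        item["labels"] = self.labels[i]
        return item

# Toy stand-in for a labelled vulnerability dataset (illustrative only).
train_ds = VulnDataset(["strcpy(buf, user_input);", "if (n < buf_len) copy(buf, n);"],
                       [1, 0])
loader = DataLoader(train_ds, batch_size=2, shuffle=True)

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for epoch in range(3):
    for batch in loader:
        optimizer.zero_grad()
        loss = model(**batch).loss  # cross-entropy over the two classes
        loss.backward()
        optimizer.step()

The same loop can be rerun with a different checkpoint name to compare models, which is essentially how an empirical comparison of this kind is organized.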

This paper is part of the VM4SEC project.