Automated Smart Contract Summarization via LLMs

programming
prompt-engineering
Evaluates Gemini-Pro-Vision against MMTrans on generating smart contract code summaries from multimodal inputs.
Authors

Yingjie Mao, Xiaoqi Li, Zongwei Li, Wenkai Li

Published

February 8, 2024

Summary:

  • The article evaluates how well Gemini-Pro-Vision generates smart contract code summaries from multimodal inputs.
  • It compares Gemini-Pro-Vision with MMTrans and explores how to combine multimodal prompts for summary generation.
  • The study measures summary quality with three widely used metrics: BLEU, METEOR, and ROUGE-L.
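
To make the metrics above more concrete, here is a minimal sketch of ROUGE-L, which scores a generated comment by the longest common subsequence (LCS) of tokens it shares with a reference comment. The whitespace tokenization, lowercasing, and `beta=1.0` weighting are assumptions made for illustration; the summary does not state which implementation or settings the study used.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference, beta=1.0):
    """ROUGE-L F-score between a generated comment and a reference comment.

    Assumption: tokens are lowercased, whitespace-separated words.
    """
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)  # share of generated tokens that are in the LCS
    recall = lcs / len(ref)      # share of reference tokens that are in the LCS
    return (1 + beta**2) * precision * recall / (recall + beta**2 * precision)


if __name__ == "__main__":
    generated = "transfer tokens to the recipient"
    reference = "transfer tokens to recipient"
    print(f"ROUGE-L: {rouge_l(generated, reference):.3f}")
```

For the pair in the usage example, the LCS covers four of the five generated tokens and all four reference tokens, so the score is about 0.889. Published results are normally computed with an established library rather than a hand-written function like this one.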

Major Findings:

  1. With three-shot prompts, Gemini-Pro-Vision achieves METEOR and ROUGE-L scores of 21.17% and 21.05%, respectively, which are higher than its scores with one-shot and five-shot prompts.
  2. MMTrans significantly outperforms Gemini-Pro-Vision on METEOR, BLEU, and ROUGE-L.
  3. Comments generated by Gemini-Pro-Vision with one-shot prompts are shorter than those generated with three-shot and five-shot prompts.
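
The one-shot, three-shot, and five-shot settings in the findings above differ only in how many worked examples precede the query. The sketch below shows one plausible way to assemble such a prompt as an ordered list of text and image parts. Everything in it is hypothetical: the `Example` class, the `build_k_shot_prompt` function, the instruction wording, and the use of image paths as the second modality are illustrative assumptions, not the study's actual prompt format or any real client API.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Example:
    code: str                         # source of one contract function
    comment: str                      # reference summary for that function
    image_path: Optional[str] = None  # hypothetical: accompanying image, if any


def build_k_shot_prompt(
    examples: List[Example],
    query_code: str,
    k: int,
    query_image: Optional[str] = None,
) -> List[Tuple[str, str]]:
    """Return ordered prompt parts, each ("text", content) or ("image", path)."""
    if k < 0 or k > len(examples):
        raise ValueError("k must be between 0 and the number of examples")
    # Assumed instruction wording; the study's real prompt is not given here.
    parts = [("text", "Write a one-sentence comment for the final function.")]
    for ex in examples[:k]:
        if ex.image_path is not None:
            parts.append(("image", ex.image_path))
        parts.append(("text", f"Code:\n{ex.code}\nComment: {ex.comment}"))
    if query_image is not None:
        parts.append(("image", query_image))
    # The query ends with an open "Comment:" slot for the model to complete.
    parts.append(("text", f"Code:\n{query_code}\nComment:"))
    return parts


if __name__ == "__main__":
    pool = [
        Example(f"function f{i}() public {{}}", f"Placeholder comment {i}.")
        for i in range(5)
    ]
    for k in (1, 3, 5):
        prompt = build_k_shot_prompt(pool, "function withdraw() public {}", k)
        print(k, "shot ->", len(prompt), "parts")
```

Keeping the parts as plain tuples makes it easy to vary `k` and compare the resulting comments without tying the experiment to a particular model client.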

Analysis and Critique:

  • Benefits:
    • Gemini-Pro-Vision’s code comments are more concise and exhibit a stronger reasoning ability.
  • Limitations:
    • Lack of a high-quality benchmark dataset for evaluating Gemini-Pro-Vision’s performance.
    • Absence of suitable metrics for evaluating comments generated by LLMs such as Gemini-Pro-Vision.

The article offers useful insight into how Gemini-Pro-Vision performs at summarizing smart contract code. It also highlights the need for a high-quality benchmark dataset and for evaluation metrics suited to LLM-generated comments. The comparison with MMTrans shows that Gemini-Pro-Vision still has room to improve on the standard metrics, and further research is needed to address these limitations.

Appendix

Model: gpt-3.5-turbo-1106
Date Generated: 2024-02-26
Abstract: https://arxiv.org/abs/2402.04863v2
HTML: https://browse.arxiv.org/html/2402.04863v2
Truncated: False
Word Count: 5365