Automated Smart Contract Summarization via LLMs
programming
prompt-engineering
Gemini-Pro-Vision is evaluated against MMTrans for generating smart contract code summaries from multimodal inputs; MMTrans still leads on standard metrics, while Gemini-Pro-Vision produces more concise comments.
Summary:
- The study evaluates Gemini-Pro-Vision on generating smart contract code summaries from multimodal inputs.
- It compares Gemini-Pro-Vision with MMTrans and explores how to build the best prompt for multimodal inputs.
- It measures the quality of the generated summaries with widely used metrics (BLEU, METEOR, and ROUGE-L); a minimal scoring sketch follows this list.
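For readers who want to reproduce this kind of evaluation, here is a minimal sketch of scoring one generated comment against a reference with the three metrics. It assumes the `nltk` and `rouge-score` packages are installed; the example strings are illustrative and are not drawn from the paper's dataset.

```python
# Minimal scoring sketch: BLEU, METEOR, and ROUGE-L for a single
# (reference, candidate) comment pair. Example strings are hypothetical.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
from nltk.translate.meteor_score import meteor_score
from rouge_score import rouge_scorer

reference = "transfer tokens from the sender to the given address"
candidate = "transfers tokens from sender to a target address"

ref_tokens, cand_tokens = reference.split(), candidate.split()

# Sentence-level BLEU with smoothing, since short comments often have
# zero higher-order n-gram overlap.
bleu = sentence_bleu([ref_tokens], cand_tokens,
                     smoothing_function=SmoothingFunction().method1)

# METEOR matches stems and synonyms via WordNet
# (requires nltk.download("wordnet") beforehand).
meteor = meteor_score([ref_tokens], cand_tokens)

# ROUGE-L measures longest-common-subsequence overlap.
rouge_l = rouge_scorer.RougeScorer(["rougeL"]).score(
    reference, candidate)["rougeL"].fmeasure

print(f"BLEU: {bleu:.4f}  METEOR: {meteor:.4f}  ROUGE-L: {rouge_l:.4f}")
```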
Major Findings:
- Evaluation of Gemini-Pro-Vision: Gemini-Pro-Vision reaches scores of 21.17% and 21.05% for code comments generated with three-shot prompts, better than the scores obtained with one-shot and five-shot prompts.
- Comparison with MMTrans: When compared head to head, MMTrans significantly outperforms Gemini-Pro-Vision in METEOR, BLEU, and ROUGE-L scores.
- Performance Metrics: The study reports Gemini-Pro-Vision's overall performance under one-shot, three-shot, and five-shot prompts alongside MMTrans, showing how the scores vary across prompt settings (a prompt-assembly sketch follows this list).
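To make the one-shot, three-shot, and five-shot settings concrete, the sketch below assembles a k-shot prompt from (code, comment) example pairs. The `Shot` dataclass, the `build_prompt` helper, the instruction wording, and the Solidity snippets are hypothetical illustrations, not the paper's actual prompt template; in a multimodal setting, an image input would be passed to the vision model alongside the assembled text.

```python
# Hypothetical k-shot prompt assembly for contract code summarization.
from dataclasses import dataclass

@dataclass
class Shot:
    code: str      # Solidity function source
    summary: str   # reference comment for that function

def build_prompt(shots: list[Shot], target_code: str) -> str:
    """Concatenate k solved examples followed by the target function."""
    parts = ["Generate a one-sentence comment for the Solidity function."]
    for i, shot in enumerate(shots, 1):
        parts.append(f"Example {i}:\nCode:\n{shot.code}\nComment: {shot.summary}")
    parts.append(f"Now summarize:\nCode:\n{target_code}\nComment:")
    return "\n\n".join(parts)

# Three-shot prompt, the setting that scored best in the study.
three_shot = build_prompt(
    shots=[
        Shot("function pause() public onlyOwner { paused = true; }",
             "pauses all token transfers"),
        Shot("function burn(uint256 amount) public { _burn(msg.sender, amount); }",
             "destroys the given amount of tokens from the caller"),
        Shot("function owner() public view returns (address) { return _owner; }",
             "returns the address of the current owner"),
    ],
    target_code="function unpause() public onlyOwner { paused = false; }",
)
print(three_shot)
```

Varying the length of the `shots` list yields the one-shot and five-shot variants the study compares.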
Analysis and Critique:
- Benefit: Gemini-Pro-Vision generates more concise code comments and exhibits stronger reasoning ability than MMTrans.
- Limitation: The study notes the lack of a high-quality benchmark dataset and of metrics suited to evaluating comments generated by LLMs such as Gemini-Pro-Vision.
- Future Expectations: The study outlines opportunities and adjustments for using Gemini-Pro-Vision to generate code comments, emphasizing the need for further exploration and for investment in constructing a high-quality test dataset.
Overall, the study offers useful insight into how well Gemini-Pro-Vision generates code summaries and highlights directions for future work, chief among them constructing a high-quality benchmark dataset and developing evaluation metrics better suited to LLM-generated comments.
Appendix
| Model | gpt-3.5-turbo-1106 |
|---|---|
| Date Generated | 2024-02-26 |
| Abstract | https://arxiv.org/abs/2402.04863v1 |
| HTML | https://browse.arxiv.org/html/2402.04863v1 |
| Truncated | False |
| Word Count | 5487 |