{"id":892,"date":"2024-09-02T12:09:14","date_gmt":"2024-09-02T11:09:14","guid":{"rendered":"https:\/\/metrics.blogg.gu.se\/?p=892"},"modified":"2024-07-04T12:15:29","modified_gmt":"2024-07-04T11:15:29","slug":"federated-learning-in-code-summarization","status":"publish","type":"post","link":"https:\/\/metrics.blogg.gu.se\/?p=892","title":{"rendered":"Federated learning in code summarization&#8230;"},"content":{"rendered":"\n<figure class=\"wp-block-image size-large\"><a href=\"https:\/\/metrics.blogg.gu.se\/files\/2021\/12\/plastic-g4f6855c58_1920.jpg\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"419\" src=\"https:\/\/metrics.blogg.gu.se\/files\/2021\/12\/plastic-g4f6855c58_1920-1024x419.jpg\" alt=\"\" class=\"wp-image-708\" srcset=\"https:\/\/metrics.blogg.gu.se\/files\/2021\/12\/plastic-g4f6855c58_1920-1024x419.jpg 1024w, https:\/\/metrics.blogg.gu.se\/files\/2021\/12\/plastic-g4f6855c58_1920-300x123.jpg 300w, https:\/\/metrics.blogg.gu.se\/files\/2021\/12\/plastic-g4f6855c58_1920-768x314.jpg 768w, https:\/\/metrics.blogg.gu.se\/files\/2021\/12\/plastic-g4f6855c58_1920-1200x491.jpg 1200w, https:\/\/metrics.blogg.gu.se\/files\/2021\/12\/plastic-g4f6855c58_1920-1320x540.jpg 1320w, https:\/\/metrics.blogg.gu.se\/files\/2021\/12\/plastic-g4f6855c58_1920.jpg 1920w\" sizes=\"(max-width: 709px) 85vw, (max-width: 909px) 67vw, (max-width: 1362px) 62vw, 840px\" \/><\/a><\/figure>\n\n\n\n<p><a href=\"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3661167.3661210\">3661167.3661210 (acm.org)<\/a><\/p>\n\n\n\n<p>So far, we have explored two approaches to code summarization &#8211; using a pre-trained model or training our own. However, both have severe limitations. Pre-trained models are often good, but too generic for the project at hand. Private models are good, but often require a lot of high-quality data and processing power. In this article, the authors propose a third way &#8211; federated learning. 
<\/p>\n\n\n\n<p>The results show that: <\/p>\n\n\n\n<ul>\n<li>Fine-tuning LLMs with few parameters significantly improved code summarization capabilities. LoRA fine-tuning on 0.062% of parameters showed substantial performance gains in metrics like C-BLEU, METEOR, and ROUGE-L.<\/li>\n<li>The federated model matched the performance of the centrally trained model within two federated rounds, indicating the viability of the federated approach for code summarization tasks.<\/li>\n<li>The federated model achieved optimal performance at round 7, demonstrating that federated learning can be an effective method for training LLMs.<\/li>\n<li>Federated fine-tuning on modest hardware (40GB GPU RAM) was feasible and efficient, with manageable run-times and memory consumption.<\/li>\n<\/ul>\n\n\n\n<p>I need to take a closer look at this model, since I like this idea. Maybe this is the beginning of the personalized bot-team that I always dreamt of?<\/p>\n","protected":false},"excerpt":{"rendered":"<p>3661167.3661210 (acm.org) So far, we have explored two different kinds of code summarization &#8211; either using a pre-trained model or training our own. However, both of them have severe limitations. The pre-trained models are often good, but too generic for the project at hand. 
The private models are good, but often require a lot of &hellip; <a href=\"https:\/\/metrics.blogg.gu.se\/?p=892\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Federated learning in code summarization&#8230;&#8221;<\/span><\/a><\/p>\n","protected":false},"author":68,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[6,4,5],"tags":[],"_links":{"self":[{"href":"https:\/\/metrics.blogg.gu.se\/index.php?rest_route=\/wp\/v2\/posts\/892"}],"collection":[{"href":"https:\/\/metrics.blogg.gu.se\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/metrics.blogg.gu.se\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/metrics.blogg.gu.se\/index.php?rest_route=\/wp\/v2\/users\/68"}],"replies":[{"embeddable":true,"href":"https:\/\/metrics.blogg.gu.se\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=892"}],"version-history":[{"count":1,"href":"https:\/\/metrics.blogg.gu.se\/index.php?rest_route=\/wp\/v2\/posts\/892\/revisions"}],"predecessor-version":[{"id":893,"href":"https:\/\/metrics.blogg.gu.se\/index.php?rest_route=\/wp\/v2\/posts\/892\/revisions\/893"}],"wp:attachment":[{"href":"https:\/\/metrics.blogg.gu.se\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=892"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/metrics.blogg.gu.se\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=892"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/metrics.blogg.gu.se\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=892"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}