https://openreview.net/forum?id=XTHfNGI3zT","text":"Published version: https://openreview.net/forum?id=XTHfNGI3zT"},"id":"2310.01188","title":"Quantifying the Plausibility of Context Reliance in Neural Machine\n Translation","thumbnailUrl":"https://cdn-thumbnails.huggingface.co/social-thumbnails/papers/2310.01188.png","upvotes":1,"publishedAt":"2023-10-02T13:26:43.000Z","isUpvotedByUser":false},{"_id":"65f86b8b68a80d887f6e9864","position":1,"type":"space","note":{"html":"Demo showcasing PECoRe usage with the `inseq attribute-context` CLI for decoder-only and encoder-decoder models.","text":"Demo showcasing PECoRe usage with the `inseq attribute-context` CLI for decoder-only and encoder-decoder models."},"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"colorFrom":"blue","colorTo":"green","createdAt":"2024-01-23T16:21:03.000Z","emoji":"🐑 🐑","id":"gsarti/pecore","lastModified":"2024-04-24T14:20:00.000Z","likes":10,"pinned":true,"private":false,"repoType":"space","runtime":{"stage":"RUNNING","hardware":{"current":"zero-a10g","requested":"zero-a10g"},"storage":null,"gcTimeout":172800,"replicas":{"current":1,"requested":1},"devMode":false,"domains":[{"domain":"gsarti-pecore.hf.space","isCustom":false,"stage":"READY"}]},"shortDescription":"Analyze context usage in LM generations with model internals","title":"PECoRe","isLikedByUser":false},{"_id":"65edad33c36e79c45c4dc9af","position":2,"type":"dataset","note":{"html":"IWSLT 2017 dataset with document-level IDs. The English-French portion was used for context-aware MT training.","text":"IWSLT 2017 dataset with document-level IDs. 
The English-French portion was used for context-aware MT training."},"author":"gsarti","downloads":93,"gated":false,"id":"gsarti/iwslt2017_context","lastModified":"2023-05-07T14:09:24.000Z","datasetsServerInfo":{"viewer":"viewer","numRows":5548640,"tags":["croissant"],"libraries":["datasets","mlcroissant"]},"private":false,"repoType":"dataset","likes":1,"isLikedByUser":false},{"_id":"65edab863dfb67e13c432b00","position":3,"type":"dataset","note":{"html":"SCAT+ dataset used for further fine-tuning and evaluation on anaphoric pronouns","text":"SCAT+ dataset used for further fine-tuning and evaluation on anaphoric pronouns"},"author":"inseq","downloads":0,"gated":false,"id":"inseq/scat","lastModified":"2024-03-10T11:41:20.000Z","private":false,"repoType":"dataset","likes":1,"isLikedByUser":false}],"position":2,"theme":"orange","private":false,"shareUrl":"https://huggingface.co/collections/gsarti/pecore-iclr-2024-65edab42e28439e21b612c2e","upvotes":1,"isUpvotedByUser":false},{"slug":"gsarti/it5-lrec-coling-2024-6600468041d8fee2c42021c8","title":"🇮🇹 IT5 @ LREC/COLING 2024","description":"Materials for the paper \"IT5:Text-to-text Pretraining for Italian Language Understanding and Generation\" published at LREC/COLING 2024","lastUpdated":"2024-03-24T16:00:39.593Z","owner":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"items":[{"_id":"660046aabda47e9bcf8380dd","position":0,"type":"paper","id":"2203.03759","title":"IT5: Large-scale Text-to-text Pretraining for Italian Language\n Understanding and 
Generation","thumbnailUrl":"https://cdn-thumbnails.huggingface.co/social-thumbnails/papers/2203.03759.png","upvotes":3,"publishedAt":"2022-03-07T22:39:01.000Z","isUpvotedByUser":false},{"_id":"660046b2a4bca9c75d58bcab","position":1,"type":"space","author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"colorFrom":"red","colorTo":"green","createdAt":"2022-03-10T10:23:00.000Z","emoji":"🤌","id":"gsarti/it5-demo","lastModified":"2024-04-16T14:31:09.000Z","likes":6,"pinned":true,"private":false,"repoType":"space","runtime":{"stage":"SLEEPING","hardware":{"current":null,"requested":"cpu-basic"},"storage":null,"gcTimeout":86400,"replicas":{"current":1,"requested":1},"devMode":false,"domains":[{"domain":"gsarti-it5-demo.hf.space","isCustom":false,"stage":"READY"}]},"shortDescription":"Test fine-tuned IT5 models for Italian language generation","title":"IT5 
Demo","isLikedByUser":false},{"_id":"660046bc2a0cdd3c5988d183","position":2,"type":"dataset","author":"gsarti","downloads":0,"gated":"manual","id":"gsarti/itagen","lastModified":"2022-04-26T09:21:47.000Z","datasetsServerInfo":{"viewer":"preview","numRows":0,"tags":[],"libraries":[]},"private":false,"repoType":"dataset","likes":0,"isLikedByUser":false},{"_id":"660046c88ae190912a2388aa","position":3,"type":"dataset","author":"gsarti","downloads":1706,"gated":false,"id":"gsarti/clean_mc4_it","lastModified":"2022-10-23T09:01:21.000Z","datasetsServerInfo":{"viewer":"viewer-partial","numRows":8990881,"tags":["croissant"],"libraries":["datasets","mlcroissant"]},"private":false,"repoType":"dataset","likes":9,"isLikedByUser":false}],"position":3,"theme":"green","private":false,"shareUrl":"https://huggingface.co/collections/gsarti/it5-lrec-coling-2024-6600468041d8fee2c42021c8","upvotes":0,"isUpvotedByUser":false}],"datasets":[{"author":"gsarti","downloads":0,"gated":"manual","id":"gsarti/qe4pe","lastModified":"2024-05-01T13:21:28.000Z","private":false,"repoType":"dataset","likes":0,"isLikedByUser":false},{"author":"gsarti","downloads":93,"gated":false,"id":"gsarti/iwslt2017_context","lastModified":"2023-05-07T14:09:24.000Z","datasetsServerInfo":{"viewer":"viewer","numRows":5548640,"tags":["croissant"],"libraries":["datasets","mlcroissant"]},"private":false,"repoType":"dataset","likes":1,"isLikedByUser":false},{"author":"gsarti","downloads":2,"gated":false,"id":"gsarti/mt_geneval","lastModified":"2022-11-21T14:52:09.000Z","datasetsServerInfo":{"viewer":"viewer","numRows":25196,"tags":["croissant"],"libraries":["datasets","mlcroissant"]},"private":false,"repoType":"dataset","likes":3,"isLikedByUser":false},{"author":"gsarti","downloads":2,"gated":false,"id":"gsarti/magpie","lastModified":"2022-10-27T08:37:46.000Z","datasetsServerInfo":{"viewer":"viewer","numRows":44451,"tags":["croissant"],"libraries":["datasets","mlcroissant"]},"private":false,"repoType":"dataset","likes":2,"i
sLikedByUser":false},{"author":"gsarti","downloads":4205,"gated":false,"id":"gsarti/wmt_vat","lastModified":"2022-10-27T08:37:41.000Z","private":false,"repoType":"dataset","likes":8,"isLikedByUser":false},{"author":"gsarti","downloads":9001,"gated":false,"id":"gsarti/flores_101","lastModified":"2022-10-27T08:37:36.000Z","datasetsServerInfo":{"viewer":"viewer","numRows":206927,"tags":["croissant"],"libraries":["datasets","mlcroissant"]},"private":false,"repoType":"dataset","likes":16,"isLikedByUser":false},{"author":"gsarti","downloads":264,"gated":false,"id":"gsarti/change_it","lastModified":"2022-10-27T08:37:09.000Z","private":false,"repoType":"dataset","likes":1,"isLikedByUser":false},{"author":"gsarti","downloads":1706,"gated":false,"id":"gsarti/clean_mc4_it","lastModified":"2022-10-23T09:01:21.000Z","datasetsServerInfo":{"viewer":"viewer-partial","numRows":8990881,"tags":["croissant"],"libraries":["datasets","mlcroissant"]},"private":false,"repoType":"dataset","likes":9,"isLikedByUser":false},{"author":"gsarti","downloads":198,"gated":false,"id":"gsarti/itacola","lastModified":"2022-07-01T15:38:55.000Z","datasetsServerInfo":{"viewer":"viewer","numRows":8776,"tags":["croissant"],"libraries":["datasets","mlcroissant"]},"private":false,"repoType":"dataset","likes":2,"isLikedByUser":false},{"author":"gsarti","downloads":0,"gated":"manual","id":"gsarti/itagen","lastModified":"2022-04-26T09:21:47.000Z","datasetsServerInfo":{"viewer":"preview","numRows":0,"tags":[],"libraries":[]},"private":false,"repoType":"dataset","likes":0,"isLikedByUser":false}],"hasMoreActivities":false,"models":[{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele 
Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":151,"gated":false,"id":"gsarti/cora_mgen","lastModified":"2024-01-23T17:24:45.000Z","likes":2,"pipeline_tag":"text2text-generation","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":8,"gated":false,"id":"gsarti/opus-mt-tc-base-en-ja","lastModified":"2023-06-21T14:12:24.000Z","likes":0,"pipeline_tag":"translation","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":60,"gated":false,"id":"gsarti/it5-large-wiki-summarization","lastModified":"2023-06-21T07:22:58.000Z","likes":0,"pipeline_tag":"summarization","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":515,"gated":false,"id":"gsarti/it5-base","lastModified":"2023-05-03T13:28:56.000Z","likes":20,"pipeline_tag":"text2text-generation","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele 
Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":1,"gated":false,"id":"gsarti/it5-efficient-small-el32","lastModified":"2023-01-10T11:04:24.000Z","likes":0,"pipeline_tag":"text2text-generation","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":62,"gated":false,"id":"gsarti/opus-mt-tc-big-en-de","lastModified":"2023-01-07T18:45:49.000Z","likes":0,"pipeline_tag":"translation","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":8,"gated":false,"id":"gsarti/opus-mt-tc-base-en-nl","lastModified":"2023-01-07T18:45:45.000Z","likes":0,"pipeline_tag":"translation","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":6,"gated":false,"id":"gsarti/opus-mt-tc-base-en-ru","lastModified":"2023-01-07T18:45:35.000Z","likes":1,"pipeline_tag":"translation","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele 
Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":3,"gated":false,"id":"gsarti/opus-mt-tc-base-en-hi","lastModified":"2023-01-07T18:45:28.000Z","likes":0,"pipeline_tag":"translation","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":127,"gated":false,"id":"gsarti/opus-mt-tc-en-pl","lastModified":"2023-01-07T18:45:22.000Z","likes":5,"pipeline_tag":"translation","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":9353,"gated":false,"id":"gsarti/it5-base-news-summarization","lastModified":"2022-10-18T13:43:57.000Z","likes":3,"pipeline_tag":"summarization","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":13,"gated":false,"id":"gsarti/it5-efficient-small-el32-ilgiornale-to-repubblica","lastModified":"2022-10-12T13:19:18.000Z","likes":0,"pipeline_tag":"text2text-generation","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele 
Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":15,"gated":false,"id":"gsarti/it5-efficient-small-el32-repubblica-to-ilgiornale","lastModified":"2022-10-12T13:16:45.000Z","likes":0,"pipeline_tag":"text2text-generation","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":15,"gated":false,"id":"gsarti/it5-efficient-small-el32-informal-to-formal","lastModified":"2022-10-12T13:12:49.000Z","likes":1,"pipeline_tag":"text2text-generation","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":15,"gated":false,"id":"gsarti/it5-efficient-small-el32-formal-to-informal","lastModified":"2022-10-12T13:11:05.000Z","likes":0,"pipeline_tag":"text2text-generation","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":13,"gated":false,"id":"gsarti/it5-efficient-small-el32-question-generation","lastModified":"2022-10-12T13:09:07.000Z","likes":0,"pipeline_tag":"text2text-generation","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele 
Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":101,"gated":false,"id":"gsarti/it5-efficient-small-el32-news-summarization","lastModified":"2022-10-12T13:07:40.000Z","likes":4,"pipeline_tag":"summarization","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":17,"gated":false,"id":"gsarti/it5-efficient-small-el32-wiki-summarization","lastModified":"2022-10-12T13:03:53.000Z","likes":0,"pipeline_tag":"summarization","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":12,"gated":false,"id":"gsarti/it5-efficient-small-el32-headline-generation","lastModified":"2022-10-12T12:59:39.000Z","likes":0,"pipeline_tag":"text2text-generation","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":2,"gated":false,"id":"gsarti/it5-base-oscar","lastModified":"2022-05-29T09:02:08.000Z","likes":0,"pipeline_tag":"text2text-generation","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele 
Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":15,"gated":false,"id":"gsarti/it5-efficient-small-el32-question-answering","lastModified":"2022-04-29T14:28:58.000Z","likes":0,"pipeline_tag":"text2text-generation","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":17,"gated":false,"id":"gsarti/it5-base-informal-to-formal","lastModified":"2022-03-17T09:52:48.000Z","likes":2,"pipeline_tag":"text2text-generation","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":203,"gated":false,"id":"gsarti/it5-small","lastModified":"2022-03-09T11:56:34.000Z","likes":1,"pipeline_tag":"text2text-generation","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":157,"gated":false,"id":"gsarti/it5-large","lastModified":"2022-03-09T11:56:08.000Z","likes":0,"pipeline_tag":"text2text-generation","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele 
Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":37,"gated":false,"id":"gsarti/it5-base-headline-generation","lastModified":"2022-03-09T08:07:05.000Z","likes":1,"pipeline_tag":"text2text-generation","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":30,"gated":false,"id":"gsarti/it5-base-wiki-summarization","lastModified":"2022-03-09T08:06:40.000Z","likes":0,"pipeline_tag":"summarization","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":19,"gated":false,"id":"gsarti/it5-base-question-generation","lastModified":"2022-03-09T08:06:11.000Z","likes":0,"pipeline_tag":"text2text-generation","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":327,"gated":false,"id":"gsarti/it5-base-question-answering","lastModified":"2022-03-09T08:05:47.000Z","likes":2,"pipeline_tag":"text2text-generation","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele 
Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":14,"gated":false,"id":"gsarti/it5-base-repubblica-to-ilgiornale","lastModified":"2022-03-09T08:05:15.000Z","likes":0,"pipeline_tag":"text2text-generation","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":17,"gated":false,"id":"gsarti/it5-base-ilgiornale-to-repubblica","lastModified":"2022-03-09T08:04:46.000Z","likes":0,"pipeline_tag":"text2text-generation","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":1,"gated":false,"id":"gsarti/it5-large-ilgiornale-to-repubblica","lastModified":"2022-03-09T08:04:16.000Z","likes":0,"pipeline_tag":"text2text-generation","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":14,"gated":false,"id":"gsarti/it5-small-ilgiornale-to-repubblica","lastModified":"2022-03-09T08:03:52.000Z","likes":0,"pipeline_tag":"text2text-generation","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele 
Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":15,"gated":false,"id":"gsarti/mt5-small-ilgiornale-to-repubblica","lastModified":"2022-03-09T08:03:24.000Z","likes":0,"pipeline_tag":"text2text-generation","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":1,"gated":false,"id":"gsarti/mt5-base-ilgiornale-to-repubblica","lastModified":"2022-03-09T08:02:59.000Z","likes":0,"pipeline_tag":"text2text-generation","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":20,"gated":false,"id":"gsarti/it5-small-repubblica-to-ilgiornale","lastModified":"2022-03-09T08:02:27.000Z","likes":0,"pipeline_tag":"text2text-generation","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":1,"gated":false,"id":"gsarti/it5-large-repubblica-to-ilgiornale","lastModified":"2022-03-09T08:01:50.000Z","likes":0,"pipeline_tag":"text2text-generation","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele 
Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":12,"gated":false,"id":"gsarti/mt5-small-repubblica-to-ilgiornale","lastModified":"2022-03-09T08:01:24.000Z","likes":0,"pipeline_tag":"text2text-generation","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":1,"gated":false,"id":"gsarti/mt5-base-repubblica-to-ilgiornale","lastModified":"2022-03-09T08:00:57.000Z","likes":0,"pipeline_tag":"text2text-generation","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":18,"gated":false,"id":"gsarti/it5-small-headline-generation","lastModified":"2022-03-09T08:00:22.000Z","likes":1,"pipeline_tag":"text2text-generation","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":74,"gated":false,"id":"gsarti/it5-large-headline-generation","lastModified":"2022-03-09T07:59:47.000Z","likes":0,"pipeline_tag":"text2text-generation","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele 
Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":13,"gated":false,"id":"gsarti/mt5-small-headline-generation","lastModified":"2022-03-09T07:59:17.000Z","likes":0,"pipeline_tag":"text2text-generation","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":1,"gated":false,"id":"gsarti/mt5-base-headline-generation","lastModified":"2022-03-09T07:58:47.000Z","likes":0,"pipeline_tag":"text2text-generation","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":25,"gated":false,"id":"gsarti/it5-small-question-answering","lastModified":"2022-03-09T07:58:17.000Z","likes":1,"pipeline_tag":"text2text-generation","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":46,"gated":false,"id":"gsarti/it5-large-question-answering","lastModified":"2022-03-09T07:57:53.000Z","likes":5,"pipeline_tag":"text2text-generation","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele 
Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":16,"gated":false,"id":"gsarti/mt5-base-question-answering","lastModified":"2022-03-09T07:57:29.000Z","likes":0,"pipeline_tag":"text2text-generation","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":10,"gated":false,"id":"gsarti/mt5-small-question-answering","lastModified":"2022-03-09T07:57:03.000Z","likes":0,"pipeline_tag":"text2text-generation","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":20,"gated":false,"id":"gsarti/it5-large-question-generation","lastModified":"2022-03-09T07:56:40.000Z","likes":2,"pipeline_tag":"text2text-generation","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":38,"gated":false,"id":"gsarti/it5-small-question-generation","lastModified":"2022-03-09T07:55:38.000Z","likes":0,"pipeline_tag":"text2text-generation","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele 
Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":15,"gated":false,"id":"gsarti/mt5-small-question-generation","lastModified":"2022-03-09T07:55:07.000Z","likes":0,"pipeline_tag":"text2text-generation","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":9,"gated":false,"id":"gsarti/mt5-base-question-generation","lastModified":"2022-03-09T07:54:16.000Z","likes":0,"pipeline_tag":"text2text-generation","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":74,"gated":false,"id":"gsarti/it5-large-news-summarization","lastModified":"2022-03-09T07:53:26.000Z","likes":0,"pipeline_tag":"summarization","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":16,"gated":false,"id":"gsarti/it5-small-news-summarization","lastModified":"2022-03-09T07:52:53.000Z","likes":1,"pipeline_tag":"summarization","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele 
Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":15,"gated":false,"id":"gsarti/mt5-small-news-summarization","lastModified":"2022-03-09T07:52:27.000Z","likes":0,"pipeline_tag":"summarization","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":4,"gated":false,"id":"gsarti/mt5-base-news-summarization","lastModified":"2022-03-09T07:51:55.000Z","likes":0,"pipeline_tag":"summarization","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":7,"gated":false,"id":"gsarti/mt5-base-wiki-summarization","lastModified":"2022-03-09T07:51:31.000Z","likes":0,"pipeline_tag":"summarization","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":16,"gated":false,"id":"gsarti/mt5-small-wiki-summarization","lastModified":"2022-03-09T07:51:07.000Z","likes":0,"pipeline_tag":"summarization","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele 
Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":23,"gated":false,"id":"gsarti/it5-small-wiki-summarization","lastModified":"2022-03-09T07:50:42.000Z","likes":0,"pipeline_tag":"summarization","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":16,"gated":false,"id":"gsarti/mt5-small-informal-to-formal","lastModified":"2022-03-09T07:49:29.000Z","likes":0,"pipeline_tag":"text2text-generation","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":1,"gated":false,"id":"gsarti/mt5-base-informal-to-formal","lastModified":"2022-03-09T07:48:51.000Z","likes":0,"pipeline_tag":"text2text-generation","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":1,"gated":false,"id":"gsarti/it5-large-informal-to-formal","lastModified":"2022-03-09T07:48:09.000Z","likes":0,"pipeline_tag":"text2text-generation","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele 
Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":20,"gated":false,"id":"gsarti/it5-small-informal-to-formal","lastModified":"2022-03-09T07:47:36.000Z","likes":0,"pipeline_tag":"text2text-generation","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":8,"gated":false,"id":"gsarti/it5-large-formal-to-informal","lastModified":"2022-03-09T07:46:17.000Z","likes":0,"pipeline_tag":"text2text-generation","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":19,"gated":false,"id":"gsarti/it5-base-formal-to-informal","lastModified":"2022-03-09T07:45:49.000Z","likes":0,"pipeline_tag":"text2text-generation","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":17,"gated":false,"id":"gsarti/it5-small-formal-to-informal","lastModified":"2022-03-09T07:45:14.000Z","likes":0,"pipeline_tag":"text2text-generation","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele 
Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":14,"gated":false,"id":"gsarti/mt5-small-formal-to-informal","lastModified":"2022-03-09T07:44:42.000Z","likes":0,"pipeline_tag":"text2text-generation","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":14,"gated":false,"id":"gsarti/mt5-base-formal-to-informal","lastModified":"2022-03-09T07:44:08.000Z","likes":0,"pipeline_tag":"text2text-generation","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":1,"gated":false,"id":"gsarti/ibyt5-base","lastModified":"2021-10-04T17:42:43.000Z","likes":1,"pipeline_tag":"text2text-generation","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":1,"gated":false,"id":"gsarti/imt5-base","lastModified":"2021-09-27T14:45:35.000Z","likes":0,"pipeline_tag":"text2text-generation","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele 
Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":17,"gated":false,"id":"gsarti/scibert-nli","lastModified":"2021-05-19T17:49:18.000Z","likes":2,"pipeline_tag":"feature-extraction","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":1,"gated":false,"id":"gsarti/covidbert-nli","lastModified":"2021-05-19T17:48:24.000Z","likes":0,"pipeline_tag":"feature-extraction","private":false,"repoType":"model","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"downloads":131659,"gated":false,"id":"gsarti/biobert-nli","lastModified":"2021-05-19T17:45:15.000Z","likes":17,"pipeline_tag":"feature-extraction","private":false,"repoType":"model","isLikedByUser":false}],"numberLikes":392,"papers":[{"id":"2405.00208","title":"A Primer on the Inner Workings of Transformer-based Language Models","thumbnailUrl":"https://cdn-thumbnails.huggingface.co/social-thumbnails/papers/2405.00208.png","upvotes":5,"publishedAt":"2024-04-30T21:20:17.000Z","isUpvotedByUser":false},{"id":"2310.03686","title":"DecoderLens: Layerwise Interpretation of Encoder-Decoder Transformers","thumbnailUrl":"https://cdn-thumbnails.huggingface.co/social-thumbnails/papers/2310.03686.png","upvotes":3,"publishedAt":"2023-10-05T17:04:59.000Z","isUpvotedByUser":false},{"id":"2310.01188","title":"Quantifying the Plausibility of Context Reliance in Neural Machine\n 
Translation","thumbnailUrl":"https://cdn-thumbnails.huggingface.co/social-thumbnails/papers/2310.01188.png","upvotes":1,"publishedAt":"2023-10-02T13:26:43.000Z","isUpvotedByUser":false},{"id":"2305.17131","title":"RAMP: Retrieval and Attribute-Marking Enhanced Prompting for\n Attribute-Controlled Translation","thumbnailUrl":"https://cdn-thumbnails.huggingface.co/social-thumbnails/papers/2305.17131.png","upvotes":0,"publishedAt":"2023-05-26T17:56:53.000Z","isUpvotedByUser":false},{"id":"2302.14220","title":"Are Character-level Translations Worth the Wait? Comparing Character-\n and Subword-level Models for Machine Translation","thumbnailUrl":"https://cdn-thumbnails.huggingface.co/social-thumbnails/papers/2302.14220.png","upvotes":0,"publishedAt":"2023-02-28T00:50:19.000Z","isUpvotedByUser":false},{"id":"2302.13942","title":"Inseq: An Interpretability Toolkit for Sequence Generation Models","thumbnailUrl":"https://cdn-thumbnails.huggingface.co/social-thumbnails/papers/2302.13942.png","upvotes":1,"publishedAt":"2023-02-27T16:45:50.000Z","isUpvotedByUser":false},{"id":"2205.12215","title":"DivEMT: Neural Machine Translation Post-Editing Effort Across\n Typologically Diverse Languages","thumbnailUrl":"https://cdn-thumbnails.huggingface.co/social-thumbnails/papers/2205.12215.png","upvotes":0,"publishedAt":"2022-05-24T17:22:52.000Z","isUpvotedByUser":false},{"id":"2203.03759","title":"IT5: Large-scale Text-to-text Pretraining for Italian Language\n Understanding and Generation","thumbnailUrl":"https://cdn-thumbnails.huggingface.co/social-thumbnails/papers/2203.03759.png","upvotes":3,"publishedAt":"2022-03-07T22:39:01.000Z","isUpvotedByUser":false},{"id":"2108.08688","title":"Contrastive Language-Image Pre-training for the Italian Language","thumbnailUrl":"https://cdn-thumbnails.huggingface.co/social-thumbnails/papers/2108.08688.png","upvotes":2,"publishedAt":"2021-08-19T13:53:47.000Z","isUpvotedByUser":false},{"id":"2008.10875","title":"ETC-NLG: End-to-end 
Topic-Conditioned Natural Language Generation","thumbnailUrl":"https://cdn-thumbnails.huggingface.co/social-thumbnails/papers/2008.10875.png","upvotes":0,"publishedAt":"2020-08-25T08:22:38.000Z","isUpvotedByUser":false}],"posts":[{"slug":"644129530281733","content":[{"type":"text","value":"🔍 Today's (self-serving) pick in Interpretability & Analysis of LMs: ","raw":"🔍 Today's (self-serving) pick in Interpretability & Analysis of LMs: "},{"type":"new_line","raw":"\n"},{"type":"new_line","raw":"\n"},{"type":"text","value":"A Primer on the Inner Workings of Transformer-based Language Models ","raw":"A Primer on the Inner Workings of Transformer-based Language Models "},{"type":"new_line","raw":"\n"},{"type":"text","value":"by ","raw":"by "},{"type":"mention","user":"javifer","raw":"@javifer"},{"type":"text","value":" ","raw":" "},{"type":"mention","user":"gsarti","raw":"@gsarti"},{"type":"text","value":" ","raw":" "},{"type":"mention","user":"arianna-bis","raw":"@arianna-bis"},{"type":"text","value":" and M. R. Costa-jussà ","raw":" and M. R. 
Costa-jussà "},{"type":"new_line","raw":"\n"},{"type":"text","value":"(","raw":"("},{"type":"mention","user":"mt-upc","raw":"@mt-upc"},{"type":"text","value":", ","raw":", "},{"type":"mention","user":"GroNLP","raw":"@GroNLP"},{"type":"text","value":", ","raw":", "},{"type":"mention","user":"facebook","raw":"@facebook"},{"type":"text","value":")","raw":")"},{"type":"new_line","raw":"\n"},{"type":"new_line","raw":"\n"},{"type":"text","value":"This primer can serve as a comprehensive introduction to recent advances in interpretability for Transformer-based LMs for a technical audience, employing a unified notation to introduce network modules and present state-of-the-art interpretability methods.","raw":"This primer can serve as a comprehensive introduction to recent advances in interpretability for Transformer-based LMs for a technical audience, employing a unified notation to introduce network modules and present state-of-the-art interpretability methods."},{"type":"new_line","raw":"\n"},{"type":"new_line","raw":"\n"},{"type":"text","value":"Interpretability methods are presented with detailed formulations and categorized as either localizing the inputs or model components responsible for a particular prediction or decoding information stored in learned representations. Then, various insights on the role of specific model components are summarized alongside recent work using model internals to direct editing and mitigate hallucinations.","raw":"Interpretability methods are presented with detailed formulations and categorized as either localizing the inputs or model components responsible for a particular prediction or decoding information stored in learned representations. 
Then, various insights on the role of specific model components are summarized alongside recent work using model internals to direct editing and mitigate hallucinations."},{"type":"new_line","raw":"\n"},{"type":"new_line","raw":"\n"},{"type":"text","value":"Finally, the paper provides a detailed picture of the open-source interpretability tools landscape, supporting the need for open-access models to advance interpretability research.","raw":"Finally, the paper provides a detailed picture of the open-source interpretability tools landscape, supporting the need for open-access models to advance interpretability research."},{"type":"new_line","raw":"\n"},{"type":"new_line","raw":"\n"},{"type":"text","value":"📄 Paper: ","raw":"📄 Paper: "},{"type":"resource","resource":{"type":"paper","id":"2405.00208"},"url":"https://huggingface.co/papers/2405.00208","raw":"https://huggingface.co/papers/2405.00208","label":"A Primer on the Inner Workings of Transformer-based Language Models (2405.00208)"},{"type":"text","value":" ","raw":" "},{"type":"new_line","raw":"\n"},{"type":"new_line","raw":"\n"},{"type":"text","value":"🔍 All daily picks: ","raw":"🔍 All daily picks: "},{"type":"link","href":"https://huggingface.co/collections/gsarti/daily-picks-in-interpretability-and-analysis-ofc-lms-65ae3339949c5675d25de2f9","raw":"https://huggingface.co/collections/gsarti/daily-picks-in-interpretability-and-analysis-ofc-lms-65ae3339949c5675d25de2f9"}],"rawContent":"🔍 Today's (self-serving) pick in Interpretability & Analysis of LMs: \n\nA Primer on the Inner Workings of Transformer-based Language Models \nby @javifer @gsarti @arianna-bis and M. R. 
Costa-jussà \n(@mt-upc, @GroNLP, @facebook)\n\nThis primer can serve as a comprehensive introduction to recent advances in interpretability for Transformer-based LMs for a technical audience, employing a unified notation to introduce network modules and present state-of-the-art interpretability methods.\n\nInterpretability methods are presented with detailed formulations and categorized as either localizing the inputs or model components responsible for a particular prediction or decoding information stored in learned representations. Then, various insights on the role of specific model components are summarized alongside recent work using model internals to direct editing and mitigate hallucinations.\n\nFinally, the paper provides a detailed picture of the open-source interpretability tools landscape, supporting the need for open-access models to advance interpretability research.\n\n📄 Paper: https://huggingface.co/papers/2405.00208 \n\n🔍 All daily picks: https://huggingface.co/collections/gsarti/daily-picks-in-interpretability-and-analysis-ofc-lms-65ae3339949c5675d25de2f9","author":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false,"isFollowing":false},"attachments":[{"type":"image","url":"https://cdn-uploads.huggingface.co/production/uploads/5e7749883d77a72421292d07/8hujqZLmtbr1qTk0lUicS.png"},{"type":"image","url":"https://cdn-uploads.huggingface.co/production/uploads/5e7749883d77a72421292d07/LARdlopywRT8octNHoMz5.png"},{"type":"image","url":"https://cdn-uploads.huggingface.co/production/uploads/5e7749883d77a72421292d07/Vs9uNQDiRck8MHXwWI18c.png"},{"type":"image","url":"https://cdn-uploads.huggingface.co/production/uploads/5e7749883d77a72421292d07/gV3shR7qgjzuP3ff_pZqV.png"}],"mentions":[{"avatarUrl":"/avatars/9c91a18cdc53587422311fd13a14833e.svg","fullname":"Arianna 
Bisazza","name":"arianna-bis","type":"user","isPro":false,"isHf":false},{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},{"avatarUrl":"/avatars/bcc94a31fab7486ca9d018245a289fb0.svg","fullname":"Javier Ferrando","name":"javifer","type":"user","isPro":false,"isHf":false}],"reactions":[{"reaction":"🚀","users":["Taylor658","giux78","lunarflu","javifer"],"count":4},{"reaction":"🧠","users":["lunarflu"],"count":1}],"publishedAt":"2024-05-03T09:03:55.000Z","updatedAt":"2024-05-03T09:03:55.666Z","commentators":[],"url":"/posts/gsarti/644129530281733","totalUniqueImpressions":1332,"numComments":0},{"slug":"983263103554280","content":[{"type":"text","value":"🔍 Today's pick in Interpretability & Analysis of LMs: by ","raw":"🔍 Today's pick in Interpretability & Analysis of LMs: by "},{"type":"mention","user":"aadityasingh","raw":"@aadityasingh"},{"type":"text","value":" T. Moskovitz, F. Hill, S. C. Y. Chan, A. M. Saxe (","raw":" T. Moskovitz, F. Hill, S. C. Y. Chan, A. M. 
Saxe ("},{"type":"mention","user":"gatsbyunit","raw":"@gatsbyunit"},{"type":"text","value":")","raw":")"},{"type":"new_line","raw":"\n"},{"type":"new_line","raw":"\n"},{"type":"text","value":"This work proposes a new methodology inspired by optogenetics (dubbed \"clamping\") to perform targeted ablations during training to estimate the causal effect of specific interventions on mechanism formation.","raw":"This work proposes a new methodology inspired by optogenetics (dubbed \"clamping\") to perform targeted ablations during training to estimate the causal effect of specific interventions on mechanism formation."},{"type":"new_line","raw":"\n"},{"type":"new_line","raw":"\n"},{"type":"text","value":"Authors use this approach to study the formation of induction heads training a 2L attention-only transformer to label examples via context information.","raw":"Authors use this approach to study the formation of induction heads training a 2L attention-only transformer to label examples via context information."},{"type":"new_line","raw":"\n"},{"type":"new_line","raw":"\n"},{"type":"text","value":"Notable findings:","raw":"Notable findings:"},{"type":"new_line","raw":"\n"},{"type":"new_line","raw":"\n"},{"type":"text","value":"- The effects of induction heads are additive and redundant, with weaker heads compensating well for the ablation of a strong induction head in case the latter is ablated.","raw":"- The effects of induction heads are additive and redundant, with weaker heads compensating well for the ablation of a strong induction head in case the latter is ablated."},{"type":"new_line","raw":"\n"},{"type":"text","value":"- Competition between induction heads might emerge as a product of optimization pressure to converge faster, but it is not strictly necessary as all heads eventually learn to solve the task.","raw":"- Competition between induction heads might emerge as a product of optimization pressure to converge faster, but it is not strictly necessary as all 
heads eventually learn to solve the task."},{"type":"new_line","raw":"\n"},{"type":"text","value":"- Previous token heads (PTH) influence induction heads in a many-to-many fashion, with any PTH eliciting above-chance prediction from a subsequent induction head","raw":"- Previous token heads (PTH) influence induction heads in a many-to-many fashion, with any PTH eliciting above-chance prediction from a subsequent induction head"},{"type":"new_line","raw":"\n"},{"type":"text","value":"- Three subcircuits for induction are identified, respectively mixing token-label information (1 + 2), matching the previous occurrence of the current class in the context (3qk + 4), and copying the label of the matched class (3v + 5).","raw":"- Three subcircuits for induction are identified, respectively mixing token-label information (1 + 2), matching the previous occurrence of the current class in the context (3qk + 4), and copying the label of the matched class (3v + 5)."},{"type":"new_line","raw":"\n"},{"type":"text","value":"- The formation of induction heads is slowed down by a larger number of classes & labels, with more classes and more labels slowing down the formation of the matching and copying mechanisms, respectively. This may have implications when selecting a vocabulary size for LLMs: larger vocabularies lead to an increased compression ratio and longer contexts, but they might make copying more challenging by delaying the formation of induction heads.","raw":"- The formation of induction heads is slowed down by a larger number of classes & labels, with more classes and more labels slowing down the formation of the matching and copying mechanisms, respectively. 
This may have implications when selecting a vocabulary size for LLMs: larger vocabularies lead to an increased compression ratio and longer contexts, but they might make copying more challenging by delaying the formation of induction heads."},{"type":"new_line","raw":"\n"},{"type":"new_line","raw":"\n"},{"type":"text","value":"💻 Code: ","raw":"💻 Code: "},{"type":"link","href":"https://github.com/aadityasingh/icl-dynamics","raw":"https://github.com/aadityasingh/icl-dynamics"},{"type":"new_line","raw":"\n"},{"type":"new_line","raw":"\n"},{"type":"text","value":"📄 Paper: ","raw":"📄 Paper: "},{"type":"resource","resource":{"type":"paper","id":"2404.07129"},"url":"https://huggingface.co/papers/2404.07129","raw":"https://huggingface.co/papers/2404.07129","label":"What needs to go right for an induction head? A mechanistic study of\n in-context learning circuits and their formation (2404.07129)"},{"type":"new_line","raw":"\n"},{"type":"new_line","raw":"\n"},{"type":"text","value":"🔍 All daily picks: ","raw":"🔍 All daily picks: "},{"type":"link","href":"https://huggingface.co/collections/gsarti/daily-picks-in-interpretability-and-analysis-ofc-lms-65ae3339949c5675d25de2f9","raw":"https://huggingface.co/collections/gsarti/daily-picks-in-interpretability-and-analysis-ofc-lms-65ae3339949c5675d25de2f9"}],"rawContent":"🔍 Today's pick in Interpretability & Analysis of LMs: by @aadityasingh T. Moskovitz, F. Hill, S. C. Y. Chan, A. M. 
Saxe (@gatsbyunit)\n\nThis work proposes a new methodology inspired by optogenetics (dubbed \"clamping\") to perform targeted ablations during training to estimate the causal effect of specific interventions on mechanism formation.\n\nAuthors use this approach to study the formation of induction heads training a 2L attention-only transformer to label examples via context information.\n\nNotable findings:\n\n- The effects of induction heads are additive and redundant, with weaker heads compensating well for the ablation of a strong induction head in case the latter is ablated.\n- Competition between induction heads might emerge as a product of optimization pressure to converge faster, but it is not strictly necessary as all heads eventually learn to solve the task.\n- Previous token heads (PTH) influence induction heads in a many-to-many fashion, with any PTH eliciting above-chance prediction from a subsequent induction head\n- Three subcircuits for induction are identified, respectively mixing token-label information (1 + 2), matching the previous occurrence of the current class in the context (3qk + 4), and copying the label of the matched class (3v + 5).\n- The formation of induction heads is slowed down by a larger number of classes & labels, with more classes and more labels slowing down the formation of the matching and copying mechanisms, respectively. 
This may have implications when selecting a vocabulary size for LLMs: larger vocabularies lead to an increased compression ratio and longer contexts, but they might make copying more challenging by delaying the formation of induction heads.\n\n💻 Code: https://github.com/aadityasingh/icl-dynamics\n\n📄 Paper: https://huggingface.co/papers/2404.07129\n\n🔍 All daily picks: https://huggingface.co/collections/gsarti/daily-picks-in-interpretability-and-analysis-ofc-lms-65ae3339949c5675d25de2f9","author":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false,"isFollowing":false},"attachments":[{"type":"image","url":"https://cdn-uploads.huggingface.co/production/uploads/5e7749883d77a72421292d07/4cU-6gC798XXUvc2d3WKP.png"},{"type":"image","url":"https://cdn-uploads.huggingface.co/production/uploads/5e7749883d77a72421292d07/by0Z54O-zUDCse-arKP4M.png"},{"type":"image","url":"https://cdn-uploads.huggingface.co/production/uploads/5e7749883d77a72421292d07/_hY-tQyYJj4HNlQeK0NHU.png"}],"mentions":[{"avatarUrl":"/avatars/f8b6bf9bb349fd50d1246b176152955c.svg","fullname":"Aaditya Singh","name":"aadityasingh","type":"user","isPro":false,"isHf":false}],"reactions":[{"reaction":"❤️","users":["javifer","samusenps"],"count":2}],"publishedAt":"2024-04-25T14:21:37.000Z","updatedAt":"2024-04-25T14:21:37.138Z","commentators":[],"url":"/posts/gsarti/983263103554280","totalUniqueImpressions":2276,"numComments":0}],"totalPosts":41,"spaces":[{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"colorFrom":"blue","colorTo":"green","createdAt":"2024-01-23T16:21:03.000Z","emoji":"🐑 
🐑","id":"gsarti/pecore","lastModified":"2024-04-24T14:20:00.000Z","likes":10,"pinned":true,"private":false,"repoType":"space","runtime":{"stage":"RUNNING","hardware":{"current":"zero-a10g","requested":"zero-a10g"},"storage":null,"gcTimeout":172800,"replicas":{"current":1,"requested":1},"devMode":false,"domains":[{"domain":"gsarti-pecore.hf.space","isCustom":false,"stage":"READY"}]},"shortDescription":"Analyze context usage in LM generations with model internals","title":"PECoRe","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"colorFrom":"red","colorTo":"green","createdAt":"2022-03-10T10:23:00.000Z","emoji":"🤌","id":"gsarti/it5-demo","lastModified":"2024-04-16T14:31:09.000Z","likes":6,"pinned":true,"private":false,"repoType":"space","runtime":{"stage":"SLEEPING","hardware":{"current":null,"requested":"cpu-basic"},"storage":null,"gcTimeout":86400,"replicas":{"current":1,"requested":1},"devMode":false,"domains":[{"domain":"gsarti-it5-demo.hf.space","isCustom":false,"stage":"READY"}]},"shortDescription":"Test fine-tuned IT5 models for Italian language generation","title":"IT5 Demo","isLikedByUser":false},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele 
Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"colorFrom":"red","colorTo":"red","createdAt":"2024-04-10T21:42:09.000Z","emoji":"🐮","id":"gsarti/grote-test-ana","lastModified":"2024-04-15T14:59:11.000Z","likes":0,"pinned":false,"private":false,"repoType":"space","runtime":{"stage":"SLEEPING","hardware":{"current":null,"requested":"cpu-basic"},"storage":null,"gcTimeout":172800,"replicas":{"current":1,"requested":1},"devMode":false,"domains":[{"domain":"gsarti-grote-test-ana.hf.space","isCustom":false,"stage":"READY"}]},"title":"GroTE","isLikedByUser":false,"originSpace":{"name":"grote/app","author":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/5e7749883d77a72421292d07/Hh0vlbrtPUO1uQrP8yJyL.png","fullname":"Grote Testing","name":"grote","type":"org","isHf":false,"isEnterprise":false}}},{"author":"gsarti","authorData":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele Sarti","name":"gsarti","type":"user","isPro":false,"isHf":false},"colorFrom":"red","colorTo":"red","createdAt":"2024-04-16T12:00:58.000Z","emoji":"🐮","id":"gsarti/grote-test-uniud","lastModified":"2024-04-15T14:59:11.000Z","likes":0,"pinned":false,"private":false,"repoType":"space","runtime":{"stage":"SLEEPING","hardware":{"current":null,"requested":"cpu-basic"},"storage":null,"gcTimeout":172800,"replicas":{"current":1,"requested":1},"devMode":false,"domains":[{"domain":"gsarti-grote-test-uniud.hf.space","isCustom":false,"stage":"READY"}]},"title":"GroTE","isLikedByUser":false,"originSpace":{"name":"gsarti/grote-test-ana","author":{"avatarUrl":"https://cdn-avatars.huggingface.co/v1/production/uploads/1670231290373-5e7749883d77a72421292d07.jpeg","fullname":"Gabriele 

Gabriele Sarti

gsarti

AI & ML interests

Interpretability for generative language models

Organizations

Posts 41

🔍 Today's (self-serving) pick in Interpretability & Analysis of LMs:

A Primer on the Inner Workings of Transformer-based Language Models
by @javifer @gsarti @arianna-bis and M. R. Costa-jussà
( @mt-upc , @GroNLP , @facebook )

This primer can serve as a comprehensive introduction to recent advances in interpretability for Transformer-based LMs for a technical audience, employing a unified notation to introduce network modules and present state-of-the-art interpretability methods.

Interpretability methods are presented with detailed formulations and categorized as either localizing the inputs or model components responsible for a particular prediction, or decoding information stored in learned representations. Then, various insights on the role of specific model components are summarized alongside recent work using model internals to direct editing and mitigate hallucinations.

Finally, the paper provides a detailed picture of the open-source interpretability tools landscape, supporting the need for open-access models to advance interpretability research.

📄 Paper: A Primer on the Inner Workings of Transformer-based Language Models (2405.00208)

🔍 All daily picks: https://huggingface.co/collections/gsarti/daily-picks-in-interpretability-and-analysis-ofc-lms-65ae3339949c5675d25de2f9

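🔍 Today's pick in Interpretability & Analysis of LMs:

What needs to go right for an induction head? A mechanistic study of in-context learning circuits and their formation
by @aadityasingh, T. Moskovitz, F. Hill, S. C. Y. Chan, A. M. Saxe
( @gatsbyunit )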

This work proposes a new methodology inspired by optogenetics, dubbed "clamping", which performs targeted ablations during training to estimate the causal effect of specific interventions on mechanism formation.

The authors use this approach to study the formation of induction heads by training a two-layer attention-only transformer to label examples using context information.

Notable findings:

- The effects of induction heads are additive and redundant, with weaker heads compensating well when a strong induction head is ablated.
- Competition between induction heads might emerge as a product of optimization pressure to converge faster, but it is not strictly necessary, as all heads eventually learn to solve the task.
- Previous-token heads (PTHs) influence induction heads in a many-to-many fashion, with any PTH eliciting above-chance predictions from a subsequent induction head.
- Three subcircuits for induction are identified, respectively mixing token-label information (1 + 2), matching the previous occurrence of the current class in the context (3qk + 4), and copying the label of the matched class (3v + 5).
- A larger number of classes and labels slows the formation of induction heads: more classes delay the matching mechanism, and more labels delay the copying mechanism. This may have implications when selecting a vocabulary size for LLMs: larger vocabularies lead to an increased compression ratio and longer contexts, but they might make copying more challenging by delaying the formation of induction heads.
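
To make the matching and copying steps described above concrete, here is a minimal Python sketch of the input-output behavior that an induction circuit is expected to implement on an in-context labeling task. It is not the authors' code and not a transformer: `induction_predict` is a hypothetical helper, and picking the most recent matching occurrence is an assumption made for illustration.

```python
from typing import Hashable, Optional, Sequence, Tuple


def induction_predict(
    context: Sequence[Tuple[Hashable, Hashable]],
    query: Hashable,
) -> Optional[Hashable]:
    """Toy match-and-copy rule (illustrative only).

    `context` is a sequence of (item, label) pairs seen earlier in the prompt.
    The function looks for an earlier item equal to `query` (matching step)
    and returns the label that was paired with it (copying step).
    """
    # Scan from the most recent pair backwards (an assumption of this sketch).
    for item, label in reversed(context):
        if item == query:      # matching: find a previous occurrence of the query
            return label       # copying: output the label that followed it
    return None                # no earlier occurrence, so nothing to copy


# Usage: the query "cat" appeared twice; the most recent pairing is copied.
pairs = [("cat", 0), ("dog", 1), ("cat", 2)]
prediction = induction_predict(pairs, "cat")
```

A trained model realizes this behavior through attention heads rather than an explicit loop; the sketch only shows what the composed subcircuits compute, not how they form during training.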

💻 Code: https://github.com/aadityasingh/icl-dynamics

📄 Paper: What needs to go right for an induction head? A mechanistic study of in-context learning circuits and their formation (2404.07129)

🔍 All daily picks: https://huggingface.co/collections/gsarti/daily-picks-in-interpretability-and-analysis-ofc-lms-65ae3339949c5675d25de2f9