Feature Extraction
Feature extraction is the task of extracting the features a model has learnt, typically as the hidden-state vectors it produces for a given input.
Input
India, officially the Republic of India, is a country in South Asia.
Output
Dimension 1 | Dimension 2 | Dimension 3 |
---|---|---|
2.583383083343506 | 2.757075071334839 | 0.9023529887199402 |
8.29393482208252 | 1.1071064472198486 | 2.03399395942688 |
-0.7754912972450256 | -1.647324562072754 | -0.6113331913948059 |
0.07087723910808563 | 1.5942802429199219 | 1.4610432386398315 |
About Feature Extraction
Use Cases
A model trained on a specific dataset learns features of that data. For instance, a model trained on an English poetry dataset learns English grammar at a very high level. That knowledge can be transferred to a new model that will be trained on tweets. Extracting features from one model and reusing them in another is called transfer learning. In practice, you can pass your dataset through a feature extraction pipeline and feed the resulting vectors to a classifier.
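As a minimal sketch of the last step, the snippet below trains a nearest-centroid classifier on fixed-length feature vectors. The functions `fit_centroids` and `predict`, the labels, and the tiny 3-dimensional vectors are all hypothetical stand-ins; in a real setup the vectors would come from a feature extraction pipeline, and any other classifier could be swapped in.

```python
from collections import defaultdict
from math import sqrt


def fit_centroids(vectors, labels):
    """Average the feature vectors of each class into one centroid per label."""
    grouped = defaultdict(list)
    for vector, label in zip(vectors, labels):
        grouped[label].append(vector)
    return {
        label: [sum(column) / len(rows) for column in zip(*rows)]
        for label, rows in grouped.items()
    }


def predict(centroids, vector):
    """Return the label whose centroid is closest in Euclidean distance."""
    def distance(a, b):
        return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    return min(centroids, key=lambda label: distance(centroids[label], vector))


# Hypothetical 3-dimensional features standing in for pooled pipeline outputs
# (assumption: real vectors would be much longer, e.g. 768 dimensions).
train_vectors = [[0.9, 0.1, 0.0], [1.1, -0.1, 0.2], [-1.0, 0.8, 0.1], [-0.8, 1.2, -0.1]]
train_labels = ["poem", "poem", "tweet", "tweet"]

centroids = fit_centroids(train_vectors, train_labels)
prediction = predict(centroids, [1.0, 0.0, 0.1])
print(prediction)  # prints "poem"
```

A nearest-centroid rule is used here only because it fits in a few lines of plain Python; the extracted features themselves are independent of the classifier that consumes them.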
Inference
from transformers import pipeline

checkpoint = "facebook/bart-base"
feature_extractor = pipeline("feature-extraction", framework="pt", model=checkpoint)
text = "Transformers is an awesome library!"

# The pipeline returns one feature vector per token: a tensor of shape (1, num_tokens, 768)
features = feature_extractor(text, return_tensors="pt")
features
'''tensor([[[ 2.5834,  2.7571,  0.9024, ...,  1.5036, -0.0435, -0.8603],
         [-1.2850, -1.0094, -2.0826, ...,  1.5993, -0.9017,  0.6426],
         [ 0.9082,  0.3896, -0.6843, ...,  0.7061,  0.6517,  1.0550],
         ...,
         [ 0.6919, -1.1946,  0.2438, ...,  1.3646, -1.8661, -0.1642],
         [-0.1701, -2.0019, -0.4223, ...,  0.3680, -1.9704, -0.0068],
         [ 0.2520, -0.6869, -1.0582, ...,  0.5198, -2.2106,  0.4547]]])'''

# Reduce along the token dimension to get a single 768-dimensional array
features[0].numpy().mean(axis=0)
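To make the reduction over the token dimension concrete, here is a small sketch of mean pooling written in plain Python. It assumes the pipeline output has been converted to nested lists (for example with `.tolist()`); the helper name `mean_pool` and the toy values are illustrative only, not part of the library.

```python
def mean_pool(token_vectors):
    """Average a (num_tokens, hidden_size) nested list over the token axis."""
    num_tokens = len(token_vectors)
    return [sum(column) / num_tokens for column in zip(*token_vectors)]


# Hypothetical output with shape (1, 3 tokens, 4 dimensions);
# a real model such as facebook/bart-base would give 768 dimensions per token.
batch = [[[1.0, 2.0, 3.0, 4.0],
          [3.0, 2.0, 1.0, 0.0],
          [2.0, 2.0, 2.0, 2.0]]]

sentence_vector = mean_pool(batch[0])  # batch[0] drops the leading batch axis
print(len(sentence_vector), sentence_vector)  # prints: 4 [2.0, 2.0, 2.0, 2.0]
```

The result has one value per hidden dimension regardless of how many tokens the input had, which is what makes it usable as a fixed-length input to a downstream model.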
The Wikipedia dataset, which contains cleaned articles in all languages, can be used to train `feature-extraction` models.