Encoding labels as integers with sklearn's LabelEncoder

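import joblib
from sklearn.preprocessing import LabelEncoder

def gen_label_encoder():
    # Class labels to encode.
    labels = ["BB", "CC"]
    le = LabelEncoder()
    le.fit(labels)
    print("le.classes_", le.classes_)
    # Print each class next to the integer code assigned to it.
    for label in le.classes_:
        print(label, le.transform([label])[0])
    # Persist the fitted encoder; the data/ directory must already exist.
    joblib.dump(le, "data/label_encoder.h5")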
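
A saved encoder is only useful if it can be loaded back later, so here is a minimal round-trip sketch. It assumes the standalone `joblib` package is installed. The temporary directory and the file name `label_encoder.joblib` are hypothetical choices for illustration, not paths from the post.

```python
import os
import tempfile

import joblib
from sklearn.preprocessing import LabelEncoder

# Fit on the same two labels used in gen_label_encoder().
le = LabelEncoder().fit(["BB", "CC"])

# Dump and reload inside a throwaway directory (hypothetical location).
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "label_encoder.joblib")
    joblib.dump(le, path)
    restored = joblib.load(path)

# The restored encoder keeps the fitted classes, so it maps labels to
# integer codes and back exactly as the original object would.
codes = restored.transform(["CC", "BB", "CC"])
decoded = restored.inverse_transform(codes)
print(codes.tolist())
print(decoded.tolist())
```

Note that joblib writes a pickle-based file, so the `.h5` extension used above is only a name and does not make the file HDF5.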

The LabelEncoder docstring:

class LabelEncoder(BaseEstimator, TransformerMixin):
    """Encode labels with value between 0 and n_classes-1.

    Read more in the :ref:`User Guide <preprocessing_targets>`.

    Attributes
    ----------
    classes_ : array of shape (n_class,)
        Holds the label for each class.

    Examples
    --------
    `LabelEncoder` can be used to normalize labels.

    >>> from sklearn import preprocessing
    >>> le = preprocessing.LabelEncoder()
    >>> le.fit([1, 2, 2, 6])
    LabelEncoder()
    >>> le.classes_
    array([1, 2, 6])
    >>> le.transform([1, 1, 2, 6]) #doctest: +ELLIPSIS
    array([0, 0, 1, 2]...)
    >>> le.inverse_transform([0, 0, 1, 2])
    array([1, 1, 2, 6])

    It can also be used to transform non-numerical labels (as long as they are
    hashable and comparable) to numerical labels.

    >>> le = preprocessing.LabelEncoder()
    >>> le.fit(["paris", "paris", "tokyo", "amsterdam"])
    LabelEncoder()
    >>> list(le.classes_)
    ['amsterdam', 'paris', 'tokyo']
    >>> le.transform(["tokyo", "tokyo", "paris"]) #doctest: +ELLIPSIS
    array([2, 2, 1]...)
    >>> list(le.inverse_transform([2, 2, 1]))
    ['tokyo', 'tokyo', 'paris']

    See also
    --------
    sklearn.preprocessing.OneHotEncoder : encode categorical integer features
        using a one-hot aka one-of-K scheme.
    """
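
The mapping described in the docstring can be pictured without sklearn at all. The following is a simplified pure-Python sketch of the idea, not the library's implementation. The helper names `fit_classes`, `transform` and `inverse_transform` are hypothetical. It assumes the classes are the sorted unique labels, which matches the sorted `classes_` shown in the docstring examples.

```python
def fit_classes(labels):
    """Return the sorted unique labels, playing the role of classes_."""
    return sorted(set(labels))


def transform(classes, labels):
    """Map each label to the index of its class."""
    index = {label: i for i, label in enumerate(classes)}
    return [index[label] for label in labels]


def inverse_transform(classes, codes):
    """Map integer codes back to the original labels."""
    return [classes[code] for code in codes]


classes = fit_classes(["paris", "paris", "tokyo", "amsterdam"])
codes = transform(classes, ["tokyo", "tokyo", "paris"])
decoded = inverse_transform(classes, codes)
print(classes)
print(codes)
print(decoded)
```

As in the docstring example, a label that was not seen during fitting has no index; in this sketch it raises a `KeyError`.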
