Classic Algorithms for Learning Python: Ten Essential Algorithms for Python Beginners

Author: YD166, 2022-11-03 06:41:03

Accuracy: 0.8904109589041096

Decision Trees for Feature Creation

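Compare each day's open price, close price, trading volume, and so on with the adjacent day's value to obtain categorical variables that record whether each quantity rose or fell.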

import numpy as np

# Create additional features: a row is labelled 'Up' when the next row's value is higher
dataset['Open_N'] = np.where(dataset['open'].shift(-1) > dataset['open'], 'Up', 'Down')
dataset['High_N'] = np.where(dataset['high'].shift(-1) > dataset['high'], 'Up', 'Down')
dataset['Low_N'] = np.where(dataset['low'].shift(-1) > dataset['low'], 'Up', 'Down')
dataset['Close_N'] = np.where(dataset['close'].shift(-1) > dataset['close'], 'Up', 'Down')
dataset['Volume_N'] = np.where(dataset['volume'].shift(-1) > dataset['volume'], 'Positive', 'Negative')
dataset.head()

[Figure 9: output of dataset.head()]
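To make the labelling rule concrete, the following minimal sketch applies the same `shift(-1)` comparison to a small made-up price series. The `toy` frame and its values are illustrative assumptions, not data from the article. Because `shift(-1)` pulls the next row's value up, each label describes the move from the current row to the next one.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the article's real dataset is not shown in this section.
toy = pd.DataFrame(
    {"open": [10.0, 11.0, 10.5, 10.5, 12.0]},
    index=pd.to_datetime(
        ["2020-01-01", "2020-01-02", "2020-01-03", "2020-01-06", "2020-01-07"]
    ),
)

# Value of the following row, aligned with the current row (NaN for the last row)
next_open = toy["open"].shift(-1)

# 'Up' only when the next value is strictly greater; ties and NaN fall to 'Down'
toy["Open_N"] = np.where(next_open > toy["open"], "Up", "Down")
labels = toy["Open_N"].tolist()
print(toy)
```

Two side effects are worth noticing: a tie is labelled 'Down', and the last row is also labelled 'Down' because a comparison with NaN is False. Dropping the last row before training may be a reasonable choice.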

Data Preprocessing

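# Features: the raw open price plus two categorical direction labels
# (column name 'open' is lowercase, matching the feature-creation code)
X = dataset[['open', 'Open_N', 'Volume_N']].values
y = dataset['Up_Down']

from sklearn import preprocessing

# Encode 'Up'/'Down' as integers (classes are sorted: 'Down' -> 0, 'Up' -> 1)
le_Open = preprocessing.LabelEncoder()
le_Open.fit(['Up', 'Down'])
X[:, 1] = le_Open.transform(X[:, 1])

# Encode 'Positive'/'Negative' as integers ('Negative' -> 0, 'Positive' -> 1)
le_Volume = preprocessing.LabelEncoder()
le_Volume.fit(['Positive', 'Negative'])
X[:, 2] = le_Volume.transform(X[:, 2])

from sklearn.model_selection import train_test_split

# Hold out 20% of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20)

Model Building and Prediction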

from sklearn.tree import DecisionTreeClassifier

# Baseline tree with default settings
classifier = DecisionTreeClassifier()
classifier.fit(X_train, y_train)

# Instantiate the model: entropy criterion, depth limited to 4
Up_Down_Tree = DecisionTreeClassifier(criterion="entropy", max_depth=4)
Up_Down_Tree.fit(X_train, y_train)

# Predict
predTree = Up_Down_Tree.predict(X_test)
print(predTree[0:5])
print(y_test[0:5])

['Up' 'Up' 'Up' 'Up' 'Down']
date
2019-12-31      Up
2019-12-25      Up
2018-01-11      Up
2020-08-21    Down
2019-11-20    Down
Name: Up_Down, dtype: object

Decision Tree Visualization

from sklearn.tree import DecisionTreeClassifier
from IPython.display import Image
from sklearn import tree
# pip install pydotplus
import pydotplus

# Create the decision tree instance
clf = DecisionTreeClassifier(random_state=0)
X = dataset[['open', 'high', 'low', 'volume', 'Open_Close', 'High_Low', 'Increase_Decrease', 'Buy_Sell_on_Open', 'Returns']]
y = dataset['Buy_Sell']

# Train the model
model = clf.fit(X, y)

# Create DOT data; class_names takes the class labels (as strings), not the feature columns
dot_data = tree.export_graphviz(clf, out_file=None, feature_names=list(X.columns), class_names=[str(c) for c in clf.classes_])

# Build the graph
graph = pydotplus.graph_from_dot_data(dot_data)

# Display the figure
Image(graph.create_png())

[Figure 10: rendered decision tree]

Decision Tree Visualization 2

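This figure shows the tree's entire decision process. It looks illegible at first glance, but after zooming in, the content of each small box is readable: the split rule, the Gini index, the sample count, the class label, and other details.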

[Figure 11: decision tree visualization 2]
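When the rendered image is too dense to read, the same rules can also be printed as text. The sketch below uses `sklearn.tree.export_text` on a small synthetic dataset; the data, the feature names `feat_a` and `feat_b`, and the `gini` helper are assumptions made for illustration and are not part of the article's dataset. The helper shows how the Gini index displayed in each node box follows from that node's class counts.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text


def gini(counts):
    """Gini impurity 1 - sum(p_k^2) for a node with the given class counts."""
    p = np.asarray(counts, dtype=float) / np.sum(counts)
    return 1.0 - np.sum(p ** 2)


# Synthetic data (assumption): the label is 1 when the first feature exceeds 0.5
rng = np.random.RandomState(0)
X_demo = rng.rand(200, 2)
y_demo = (X_demo[:, 0] > 0.5).astype(int)

tree_demo = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X_demo, y_demo)

# Plain-text rendering of the split rules and leaf classes
rules = export_text(tree_demo, feature_names=["feat_a", "feat_b"])
print(rules)

print(gini([50, 50]))  # perfectly mixed two-class node: 0.5
print(gini([100, 0]))  # pure node: 0.0
```

A text dump like this is easier to search and diff than a PNG, although it omits the per-node sample counts unless extra options are requested.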

Support Vector Machine Classifier

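A support vector machine (SVM) is a binary classification model. Its basic form is the maximum-margin linear classifier defined on the feature space; maximizing the margin is what distinguishes it from the perceptron.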

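The SVM learning strategy is margin maximization, which can be formalized as a convex quadratic programming problem and is equivalent to minimizing a regularized hinge loss. The SVM learning algorithm is therefore an optimization algorithm for solving convex quadratic programs.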
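For reference, the two formulations mentioned above can be written out as follows. This is the standard soft-margin form from textbooks, given here as a reminder rather than taken from the article; the symbols $w$, $b$, $\xi_i$, $C$, and $\lambda$ are the usual notation and are assumptions of this sketch. Setting every $\xi_i = 0$ recovers the hard-margin (pure maximum-margin) problem.

```latex
\begin{aligned}
&\min_{w,\,b,\,\xi}\ \frac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{N}\xi_i
\quad \text{s.t.}\quad y_i\,(w\cdot x_i + b) \ge 1 - \xi_i,\quad \xi_i \ge 0,\quad i = 1,\dots,N \\[4pt]
&\min_{w,\,b}\ \sum_{i=1}^{N}\max\bigl(0,\ 1 - y_i\,(w\cdot x_i + b)\bigr) + \lambda\lVert w\rVert^{2},
\qquad \lambda = \frac{1}{2C}
\end{aligned}
```

The second line follows from the first by noting that the optimal slack is $\xi_i = \max(0,\ 1 - y_i(w\cdot x_i + b))$ and dividing the objective by $C$.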

SVMs also include the kernel trick, which makes them effectively nonlinear classifiers.
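A small experiment can illustrate what the kernel trick buys. The sketch below is not from the article: it uses scikit-learn's `make_circles` to generate two concentric rings (synthetic data, chosen here as an assumption because no straight line separates them) and compares a linear kernel with an RBF kernel.

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Two concentric rings: not separable by a straight line in the input space
X_c, y_c = make_circles(n_samples=400, noise=0.05, factor=0.3, random_state=0)
Xc_train, Xc_test, yc_train, yc_test = train_test_split(
    X_c, y_c, test_size=0.25, random_state=0
)

# Same classifier, different kernels
linear_acc = SVC(kernel="linear").fit(Xc_train, yc_train).score(Xc_test, yc_test)
rbf_acc = SVC(kernel="rbf").fit(Xc_train, yc_train).score(Xc_test, yc_test)

print("linear kernel accuracy:", linear_acc)
print("rbf kernel accuracy:", rbf_acc)
```

On data like this the linear kernel should score close to chance while the RBF kernel separates the rings almost perfectly, which is the sense in which the kernel makes the classifier nonlinear.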

Implementing an SVM in Sklearn is also fairly convenient.

from sklearn.svm import SVC  # "Support Vector Classifier"
from sklearn.metrics import accuracy_score
from sklearn.metrics import classification_report

# RBF-kernel SVM with fixed C and gamma
model = SVC(kernel='rbf', C=1000, gamma=0.001)
model.fit(X_train, y_train)
svc_predictions = model.predict(X_test)

print("Accuracy of SVM using optimized parameters ", accuracy_score(y_test, svc_predictions) * 100)
print("Report : ", classification_report(y_test, svc_predictions))
print("Score : ", model.score(X_test, y_test))

For more on evaluating classification models, see the evaluation metrics described in that article.
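As a quick reminder of what the most common of those metrics measure, here is a minimal sketch on hand-written labels. The `y_true` and `y_pred` lists are made up for illustration and are not results from the models above; 'Up' is treated as the positive class by assumption.

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    precision_score,
    recall_score,
)

# Hypothetical ground truth and predictions
y_true = ["Up", "Up", "Down", "Up", "Down", "Down", "Up", "Down"]
y_pred = ["Up", "Down", "Down", "Up", "Up", "Down", "Up", "Down"]

# Rows are true classes, columns are predicted classes, in the order given by labels
cm = confusion_matrix(y_true, y_pred, labels=["Down", "Up"])
acc = accuracy_score(y_true, y_pred)                     # fraction of correct predictions
prec = precision_score(y_true, y_pred, pos_label="Up")   # TP / (TP + FP)
rec = recall_score(y_true, y_pred, pos_label="Up")       # TP / (TP + FN)

print(cm)
print(acc, prec, rec)
```

Here 6 of 8 predictions are correct, with one false 'Up' and one missed 'Up', so accuracy, precision, and recall all come out to 0.75.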


Copyright © 2018 - 2021 www.yd166.com., All Rights Reserved.