
# 5.3 SVR · Predicting Stock Opening Prices with SVR v1.0

> Source: https://uqer.io/community/share/5646f635f9f06c4446b48126

## 1. Strategy Overview

The core idea of this strategy is to use an SVR model to fit each stock's daily opening price by regression: the previous trading day's `['openPrice','highestPrice','lowestPrice','closePrice','turnoverVol','turnoverValue']` values serve as the independent variables, and the current day's `'openPrice'` is the dependent variable. The SVR implementation comes from the third-party library scikit-learn.

## 2. SVR

[SVR details](http://scikit-learn.org/stable/modules/svm.html#svr)

The SVR reference is listed further down.

![](https://box.kancloud.cn/2016-07-31_579d7a00e3092.jpg)

### SVM-Regression

The method of Support Vector Classification can be extended to solve regression problems. This method is called Support Vector Regression.

The model produced by support vector classification (as described above) depends only on a subset of the training data, because the cost function for building the model does not care about training points that lie beyond the margin. Analogously, the model produced by Support Vector Regression depends only on a subset of the training data, because the cost function for building the model ignores any training data close to the model prediction.

There are three different implementations of Support Vector Regression: SVR, NuSVR and LinearSVR. LinearSVR provides a faster implementation than SVR but only considers linear kernels, while NuSVR implements a slightly different formulation than SVR and LinearSVR.
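
The statement that the cost function "ignores any training data close to the model prediction" corresponds to the epsilon-insensitive loss: residuals smaller than `epsilon` cost nothing, and larger ones are penalized linearly. The following is a minimal pure-Python sketch of that loss, not scikit-learn's internal code. The helper name `epsilon_insensitive_loss` and the toy values are assumptions made for illustration; `epsilon=0.1` mirrors scikit-learn's default for `SVR`.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Per-sample loss: zero inside the epsilon tube, linear outside it.

    Hypothetical helper for illustration only; not part of scikit-learn.
    """
    return [max(0.0, abs(t - p) - epsilon) for t, p in zip(y_true, y_pred)]


# Toy values (assumed): three predictions for the same target of 1.0
y_true = [1.00, 1.00, 1.00]
y_pred = [1.05, 1.30, 0.50]

losses = epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1)
for target, pred, loss in zip(y_true, y_pred, losses):
    print("target=%.2f pred=%.2f loss=%.2f" % (target, pred, loss))
```

The first sample lies inside the tube and contributes no loss, which is why only the samples on or outside the tube (the support vectors) end up shaping the fitted model.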

As with classification classes, the fit method will take as argument vectors X, y, only that in this case y is expected to have floating point values instead of integer values:

```py
>>> from sklearn import svm
>>> X = [[0, 0], [2, 2]]
>>> y = [0.5, 2.5]
>>> clf = svm.SVR()
>>> clf.fit(X, y)
SVR(C=1.0, cache_size=200, coef0=0.0, degree=3, epsilon=0.1, gamma='auto',
    kernel='rbf', max_iter=-1, shrinking=True, tol=0.001, verbose=False)
>>> clf.predict([[1, 1]])
array([ 1.5])
```

Support Vector Regression (SVR) using linear and non-linear kernels:

```py
import numpy as np
from sklearn.svm import SVR
import matplotlib.pyplot as plt

###############################################################################
# Generate sample data
X = np.sort(5 * np.random.rand(40, 1), axis=0)
y = np.sin(X).ravel()

###############################################################################
# Add noise to targets
y[::5] += 3 * (0.5 - np.random.rand(8))

###############################################################################
# Fit regression model
svr_rbf = SVR(kernel='rbf', C=1e3, gamma=0.1)
svr_lin = SVR(kernel='linear', C=1e3)
svr_poly = SVR(kernel='poly', C=1e3, degree=2)
y_rbf = svr_rbf.fit(X, y).predict(X)
y_lin = svr_lin.fit(X, y).predict(X)
y_poly = svr_poly.fit(X, y).predict(X)

###############################################################################
# look at the results
plt.scatter(X, y, c='k', label='data')
plt.plot(X, y_rbf, c='g', label='RBF model')
plt.plot(X, y_lin, c='r', label='Linear model')
plt.plot(X, y_poly, c='b', label='Polynomial model')
plt.xlabel('data')
plt.ylabel('target')
plt.title('Support Vector Regression')
plt.legend()
plt.show()
```

![](https://box.kancloud.cn/2016-07-31_579d7a0109bbf.png)

## 3. PS

Originally the model used the previous day's data to predict the current day. In Quartz, however, a trading strategy is formalized as deciding, by a set of rules, which stocks to buy and in what quantity at each trading day's opening price. This makes no difference in backtesting, but in paper trading the current day's closePrice and similar fields cannot be obtained, so the program was changed to use the data of trading day n-2 as the independent variables and the openPrice of trading day n as the dependent variable.

Stock screening is still weak. This program only tests two screens separately, "remove illiquid stocks" and "top N stocks with a net profit growth rate above 1". Neither feels satisfactory to me, and I hope more experienced readers can suggest effective screening methods.

Stock indices cannot be predicted precisely most of the time, so this strategy is for reference only.

Along the way I found that, for some stocks, the data returned by `get_attribute_history` and by `DataAPI.MktEqudGet` differ.

As for the stop-loss: the same stop-loss rule has a clearly visible effect on other platforms, but it does not seem to take effect on Uqer. I am not sure whether my code contains a bug, and corrections are welcome.

The code is somewhat messy. Please bear with it; pointers on its shortcomings are appreciated.

References:

- Alex J. Smola, Bernhard Schölkopf. "[A Tutorial on Support Vector Regression](http://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=C8B1B0729901DAD2D3C93AEBB4DBED61?doi=10.1.1.114.4288&rep=rep1&type=pdf)". Statistics and Computing, Volume 14, Issue 3, August 2004, pp. 199-222.

```py
# Define the SVR prediction function
from sklearn import svm


def svr_predict(tickerlist, strattime_trainX, endtime_trainX,
                strattime_trainY, endtime_trainY, time_testX):
    fields = ['openPrice', 'highestPrice', 'lowestPrice', 'closePrice',
              'turnoverVol', 'turnoverValue']

    # Get train data
    Per_Train_X = DataAPI.MktEqudGet(secID=tickerlist, beginDate=strattime_trainX,
                                     endDate=endtime_trainX, field=fields, pandas="1")
    Train_X = [list(Per_Train_X.iloc[i]) for i in range(len(Per_Train_X))]

    # Get train label
    Train_label = DataAPI.MktEqudGet(secID=tickerlist, beginDate=strattime_trainY,
                                     endDate=endtime_trainY, field='openPrice', pandas="1")
    Train_label = list(Train_label['openPrice'])

    # Features and labels must align one-to-one; otherwise no prediction is made
    if len(Train_X) != len(Train_label):
        return None

    # Get test data
    Per_Test_X = DataAPI.MktEqudGet(secID=tickerlist, tradeDate=time_testX,
                                    field=fields, pandas="1")
    Test_X = [list(Per_Test_X.iloc[i]) for i in range(len(Per_Test_X))]

    # Fit regression model
    clf = svm.SVR()
    clf.fit(Train_X, Train_label)
    PRY = clf.predict(Test_X)

    # Return a number rather than a formatted string ('%.2f' % PRY[0]),
    # so that callers can compare the prediction with prices numerically
    return round(PRY[0], 2)
```

```py
from CAL.PyCAL import *
from heapq import nsmallest
import pandas as pd

start = '2013-05-01'  # backtest start date
end = '2015-10-01'    # backtest end date
benchmark = 'HS300'  # benchmark for the strategy
universe = set_universe('ZZ500')  # + set_universe('SH180') + set_universe('HS300')  # security pool; supports stocks and funds
# universe = StockScreener(Factor('LCAP').nsmall(300))  # first use the screener to pick the N smallest-cap stocks
capital_base = 1000000  # initial capital
freq = 'd'  # strategy type: 'd' is a daily strategy backtested on daily bars, 'm' is an intraday strategy backtested on minute bars
refresh_rate = 1  # rebalancing frequency, i.e. the interval between handle_data calls: trading days if freq = 'd', minutes if freq = 'm'
commission = Commission(buycost=0.0008, sellcost=0.0018)  # commission: 0.08% on buys, 0.18% on sells
cal = Calendar('China.SSE')
stocknum = 50


def initialize(account):  # initialize the virtual account state
    pass


def handle_data(account):  # buy and sell orders for each trading day
    global stocknum

    # Dates
    today = Date.fromDateTime(account.current_date).strftime('%Y%m%d')  # current date
    strattime_trainY = cal.advanceDate(today, '-100B', BizDayConvention.Preceding).strftime('%Y%m%d')
    endtime_trainY = time_testX = cal.advanceDate(today, '-1B', BizDayConvention.Preceding).strftime('%Y%m%d')
    strattime_trainX = cal.advanceDate(strattime_trainY, '-2B', BizDayConvention.Preceding).strftime('%Y%m%d')
    endtime_trainX = cal.advanceDate(endtime_trainY, '-2B', BizDayConvention.Preceding).strftime('%Y%m%d')
    history_start_time = cal.advanceDate(today, '-2B', BizDayConvention.Preceding).strftime('%Y%m%d')
    history_end_time = cal.advanceDate(today, '-1B', BizDayConvention.Preceding).strftime('%Y%m%d')

    #######################################################################
    # # Alternative screen: top N stocks whose net profit growth rate is above 1 today.
    # # The API limits how much can be read per call, so it is queried in batches.
    # getData_today = pd.DataFrame()
    # for i in xrange(300,len(account.universe),300):
    #     tmp = DataAPI.MktStockFactorsOneDayGet(secID=account.universe[i-300:i],tradeDate=today,field=['secID','MA5','MA10','NetProfitGrowRate'],pandas="1")
    #     getData_today = pd.concat([getData_today,tmp],axis = 0)
    # i = (len(account.universe) / 300)*300
    # tmp = DataAPI.MktStockFactorsOneDayGet(secID=account.universe[i:],tradeDate=today,field=['secID','NetProfitGrowRate'],pandas="1")
    # getData_today = pd.concat([getData_today,tmp],axis = 0)
    # getData_today=getData_today[getData_today.NetProfitGrowRate>=1.0].dropna()
    # getData_today=getData_today.sort(columns='NetProfitGrowRate',ascending=False)
    # getData_today=getData_today.head(100)
    # buylist = list(getData_today['secID'])

    #######################################################################
    # Remove illiquid stocks (20-day mean turnover value below 10**7),
    # then keep the stocknum cheapest of the remainder
    tv = account.get_attribute_history('turnoverValue', 20)
    mtv = {sec: sum(tvs) / 20. for sec, tvs in tv.items()}
    per_butylist = [s for s in account.universe if mtv.get(s, 0) >= 10**7]
    bucket = {}
    for stock in per_butylist:
        bucket[stock] = account.referencePrice[stock]
    buylist = nsmallest(stocknum, bucket, key=bucket.get)

    #########################################################################
    # Closing prices of the last two trading days, fetched in batches of 300.
    # Stepping from 0 also covers a universe whose size is an exact multiple of 300.
    history = pd.DataFrame()
    for i in range(0, len(account.universe), 300):
        tmp = DataAPI.MktEqudGet(secID=account.universe[i:i + 300], beginDate=history_start_time,
                                 endDate=history_end_time, field=u"secID,closePrice", pandas="1")
        history = pd.concat([history, tmp], axis=0)
    # history = account.get_attribute_history('closePrice', 2)
    # history = DataAPI.MktEqudGet(secID=account.universe,beginDate=history_start_time,endDate=history_end_time,field=u"secID,closePrice",pandas="1")
    history.columns = ['secID', 'closePrice']
    keys = list(history['secID'])
    history.set_index('secID', inplace=True)

    ########################################################################
    # Sell & stop-loss
    for stock in account.valid_secpos:
        if stock in keys:
            PRY = svr_predict(stock, strattime_trainX, endtime_trainX,
                              strattime_trainY, endtime_trainY, time_testX)
            closes = list(history['closePrice'][stock])
            # float() keeps the comparison numeric even if svr_predict returns a string
            predicted_drop = (PRY is not None) and (float(PRY) < closes[-1])
            stop_loss = (closes[-1] / closes[0]) - 1 <= -0.05
            if predicted_drop or stop_loss:
                order_to(stock, 0)

    # Buy
    for stock in buylist:
        N = stocknum - len(account.valid_secpos)
        if (stock in keys) and (N > 0):
            if stock not in account.valid_secpos:
                PRY = svr_predict(stock, strattime_trainX, endtime_trainX,
                                  strattime_trainY, endtime_trainY, time_testX)
                if (PRY is not None) and (float(PRY) > list(history['closePrice'][stock])[-1]):
                    amount = (account.cash / N) / account.referencePrice[stock]
                    order(stock, amount)
```

![](https://box.kancloud.cn/2016-07-30_579cbdac12c50.jpg)
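
The pairing of trading day n-2 features with the day n `openPrice`, described in the PS section, can be illustrated without the Uqer data API. The sketch below is only an illustration under stated assumptions: `make_lagged_pairs` is a hypothetical helper, and the toy rows hold just `[openPrice, closePrice]` instead of the six fields used by the strategy.

```python
def make_lagged_pairs(rows, lag=2, target_index=0):
    """Pair each day's feature row with the target value `lag` trading days later.

    rows: per-day feature lists, oldest first (hypothetical input layout).
    target_index: position of the target field (here openPrice) in each row.
    """
    if lag < 1 or lag >= len(rows):
        raise ValueError("lag must be between 1 and len(rows) - 1")
    features = rows[:-lag]                               # days 0 .. n-lag-1
    targets = [row[target_index] for row in rows[lag:]]  # days lag .. n-1
    return features, targets


# Toy data (assumed): [openPrice, closePrice] for five consecutive trading days
days = [
    [10.0, 10.2],
    [10.3, 10.1],
    [10.1, 10.4],
    [10.5, 10.6],
    [10.6, 10.3],
]

X, y = make_lagged_pairs(days, lag=2)
for features, target in zip(X, y):
    print(features, "->", target)
```

Shifting both ends of the feature window back by the same number of trading days, as `handle_data` does with `'-2B'`, is what keeps the two lists the same length. `svr_predict` checks that same length condition before fitting.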