Holidays
In [1]:
import pandas as pd
df = pd.read_csv("apple_stock_price_nodate.csv")
df.head()
Out[1]:
Open
High
Low
Close
Adj Close
Volume
0
190.679993
191.960007
189.559998
191.610001
188.737030
15989400
1
192.449997
193.660004
192.050003
193.000000
190.106216
18697900
2
193.059998
194.850006
192.429993
194.820007
191.898926
16709900
3
194.610001
195.960007
193.610001
194.210007
191.298080
19076000
4
194.990005
195.190002
190.100006
190.979996
188.116501
24024000
Using 'B' frequency is not going to help because 4th July was holiday and 'B' is not taking that into account. It only accounts for weekends
Generate US holidays calendar frequency
In [2]:
from pandas.tseries.holiday import USFederalHolidayCalendar
from pandas.tseries.offsets import CustomBusinessDay
us_cal = CustomBusinessDay(calendar=USFederalHolidayCalendar())
rng = pd.date_range(start="2018-07-23",periods=df.shape[0],freq=us_cal)
rng
Out[2]:
DatetimeIndex(['2018-07-23', '2018-07-24', '2018-07-25', '2018-07-26',
'2018-07-27', '2018-07-30', '2018-07-31', '2018-08-01',
'2018-08-02', '2018-08-03',
...
'2019-07-10', '2019-07-11', '2019-07-12', '2019-07-15',
'2019-07-16', '2019-07-17', '2019-07-18', '2019-07-19',
'2019-07-22', '2019-07-23'],
dtype='datetime64[ns]', length=252, freq='C')
In [5]:
df.set_index(rng,inplace=True)
df.head()
Out[5]:
Open
High
Low
Close
Adj Close
Volume
2018-07-23
190.679993
191.960007
189.559998
191.610001
188.737030
15989400
2018-07-24
192.449997
193.660004
192.050003
193.000000
190.106216
18697900
2018-07-25
193.059998
194.850006
192.429993
194.820007
191.898926
16709900
2018-07-26
194.610001
195.960007
193.610001
194.210007
191.298080
19076000
2018-07-27
194.990005
195.190002
190.100006
190.979996
188.116501
24024000
You can define your own calendar using AbstractHolidayCalendar as shown below. USFederalHolidayCalendar is the only calendar available in pandas library and it serves as an example for those who want to write their own custom calendars. Here is the link for USFederalHolidayCalendar implementation https://github.com/pandas-dev/pandas/blob/master/pandas/tseries/holiday.py
In [12]:
from pandas.tseries.holiday import AbstractHolidayCalendar, nearest_workday, Holiday
class myCalendar(AbstractHolidayCalendar):
rules = [
Holiday('My Birth Day', month=5, day=13),#, observance=nearest_workday),
]
my_bday = CustomBusinessDay(calendar=myCalendar())
pd.date_range('5/1/2019','6/1/2019',freq=my_bday)
Out[12]:
DatetimeIndex(['2019-05-01', '2019-05-02', '2019-05-03', '2019-05-06',
'2019-05-07', '2019-05-08', '2019-05-09', '2019-05-10',
'2019-05-14', '2019-05-15', '2019-05-16', '2019-05-17',
'2019-05-20', '2019-05-21', '2019-05-22', '2019-05-23',
'2019-05-24', '2019-05-27', '2019-05-28', '2019-05-29',
'2019-05-30', '2019-05-31'],
dtype='datetime64[ns]', freq='C')
Weekend in egypt is Friday and Saturday. Sunday is just a normal weekday and you can handle this custom week schedule using CystomBysinessDay with weekmask as shown below
In [15]:
egypt_weekdays = "Sun Mon Tue Wed Thu"
b = CustomBusinessDay(weekmask=egypt_weekdays)
pd.date_range(start="7/1/2018",periods=20,freq=b)
Out[15]:
DatetimeIndex(['2018-07-01', '2018-07-02', '2018-07-03', '2018-07-04',
'2018-07-05', '2018-07-08', '2018-07-09', '2018-07-10',
'2018-07-11', '2018-07-12', '2018-07-15', '2018-07-16',
'2018-07-17', '2018-07-18', '2018-07-19', '2018-07-22',
'2018-07-23', '2018-07-24', '2018-07-25', '2018-07-26'],
dtype='datetime64[ns]', freq='C')
You can also add holidays to this custom business day frequency¶
In [17]:
b = CustomBusinessDay(holidays=['2018-08-04', '2018-08-05'], weekmask=egypt_weekdays)
pd.date_range(start="8/1/2018",periods=20,freq=b)
Out[17]:
DatetimeIndex(['2018-08-01', '2018-08-02', '2018-08-06', '2018-08-07',
'2018-08-08', '2018-08-09', '2018-08-12', '2018-08-13',
'2018-08-14', '2018-08-15', '2018-08-16', '2018-08-19',
'2018-08-20', '2018-08-21', '2018-08-22', '2018-08-23',
'2018-08-26', '2018-08-27', '2018-08-28', '2018-08-29'],
dtype='datetime64[ns]', freq='C')
Mathematical operations on date object using custom business day¶
In [18]:
from datetime import datetime
dt = datetime(2017,7,9)
dt
Out[18]:
datetime.datetime(2017, 7, 9, 0, 0)
Out[19]:
Timestamp('2017-07-10 00:00:00')
Last updated
Was this helpful?