仕事の部屋: December 2014

Sunday, December 28, 2014

141228

Haskell

union

Prelude> :m + Data.List
Prelude Data.List> :t union
union :: Eq a => [a] -> [a] -> [a]
Prelude Data.List> union [1,2,2,3,3,3,4,4,4,4] [3,4,4,5,6,6]
[1,2,2,3,3,3,4,4,4,4,5,6]
Prelude Data.List> union [3,4,4,5,6,6] [1,2,2,3,3,3,4,4,4,4]
[3,4,4,5,6,6,1,2]
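
The two results differ because union is not symmetric: the first list is kept as is, duplicates included, and only the elements of the second list that are not already present are appended, each once. As a rough illustration, the same behaviour can be sketched in Python. The name hs_union is made up for this sketch and is not part of any library.

```python
def hs_union(xs, ys):
    """Sketch of Data.List.union semantics (hypothetical helper, not a library function)."""
    result = list(xs)      # the first list is kept unchanged, duplicates included
    added = []             # elements of ys that get appended, each at most once
    for y in ys:
        # append y only if it is in neither xs nor the part already appended
        if y not in xs and y not in added:
            added.append(y)
    return result + added


print(hs_union([1, 2, 2, 3, 3, 3, 4, 4, 4, 4], [3, 4, 4, 5, 6, 6]))
# [1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 6]
print(hs_union([3, 4, 4, 5, 6, 6], [1, 2, 2, 3, 3, 3, 4, 4, 4, 4]))
# [3, 4, 4, 5, 6, 6, 1, 2]
```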

foldl1

Prelude> foldl1 (+) [1..10]
55
Prelude> foldl1 (*) [1..10]
3628800
Prelude> foldl1 (lcm) [1..10]
2520
Prelude> foldl1 (gcd) [8,12,36]
4

Wednesday, December 24, 2014

141224

Haskell

List difference

\\ is a function in Data.List, not in PreludeList.

Prelude> import Data.List
Prelude Data.List> [x | x <- [220,230..320]] \\ [x+y | x <- [0,50..300], y <- [0,80..320]]
[220,270]
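
For each element of the right-hand list, \\ removes only the first matching occurrence from the left-hand list. A small Python sketch of that rule, under the assumption that a plain list is a fair stand-in for a Haskell list; hs_difference is a made-up name used only here.

```python
def hs_difference(xs, ys):
    """Sketch of the Data.List (\\\\) semantics (hypothetical helper)."""
    result = list(xs)
    for y in ys:
        if y in result:
            result.remove(y)   # list.remove drops only the first match
    return result


left = list(range(220, 321, 10))
right = [x + y for x in range(0, 301, 50) for y in range(0, 321, 80)]
print(hs_difference(left, right))      # [220, 270]
print(hs_difference([1, 1, 2], [1]))   # [1, 2]
```

The second call shows the difference from a set difference: one 1 is left over because only one occurrence was removed.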

Sunday, December 7, 2014

141207(4)

NumPy

Splitting

>>> import numpy as np
>>> ary = np.arange(20).reshape(5, 4)
>>> first, second, third, fourth = np.split(ary, [1,2,3])
>>> first
array([[0, 1, 2, 3]])
>>> fourth
array([[12, 13, 14, 15],
       [16, 17, 18, 19]])
>>> first, second, third = np.split(ary, [1,2], axis=1)
>>> first
array([[ 0],
       [ 4],
       [ 8],
       [12],
       [16]])
>>> third
array([[ 2,  3],
       [ 6,  7],
       [10, 11],
       [14, 15],
       [18, 19]])
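
The index list passed to np.split marks the cut points, so each piece corresponds to an ordinary slice. A minimal check of that reading, reusing the same 5x4 array:

```python
import numpy as np

ary = np.arange(20).reshape(5, 4)

# Indices [1, 2, 3] along axis 0 cut the rows into [0:1], [1:2], [2:3], [3:]
parts = np.split(ary, [1, 2, 3])
expected = [ary[0:1], ary[1:2], ary[2:3], ary[3:]]
rows_match = all(np.array_equal(p, e) for p, e in zip(parts, expected))

# With axis=1 the same rule applies to columns: [:, 0:1], [:, 1:2], [:, 2:]
col_parts = np.split(ary, [1, 2], axis=1)
col_expected = [ary[:, 0:1], ary[:, 1:2], ary[:, 2:]]
cols_match = all(np.array_equal(p, e) for p, e in zip(col_parts, col_expected))

print(rows_match, cols_match)   # True True
```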

141207(3)

Pandas

Panel

>>> import pandas as pd
>>> import pandas.io.data as web
>>> pdata = pd.Panel(dict((stk, web.get_data_yahoo(stk, '12/1/2014', '12/5/2014')) for stk in ['AAPL', 'IBM', 'MSFT']))
>>> pdata
<class 'pandas.core.panel.Panel'>
Dimensions: 3 (items) x 5 (major_axis) x 6 (minor_axis)
Items axis: AAPL to MSFT
Major_axis axis: 2014-12-01 00:00:00 to 2014-12-05 00:00:00
Minor_axis axis: Open to Adj Close
>>> pdata = pdata.swapaxes('items','minor')
>>> pdata['Adj Close']
              AAPL     IBM   MSFT
Date
2014-12-01  115.07  161.54  48.62
2014-12-02  114.63  162.67  48.46
2014-12-03  115.93  164.52  48.08
2014-12-04  115.49  164.05  48.84
2014-12-05  115.00  163.27  48.42

141207(2)

Pandas

Index-based access

>>> import numpy as np
>>> from pandas import Series, DataFrame
>>> frame = DataFrame(np.arange(6).reshape(3,2), index=[3,2,1])
>>> frame
   0  1
3  0  1
2  2  3
1  4  5
>>> frame.irow(0)
0    0
1    1
Name: 3, dtype: int32
>>> frame.irow(1)
0    2
1    3
Name: 2, dtype: int32
>>> frame.irow(2)
0    4
1    5
Name: 1, dtype: int32
>>> frame.icol(0)
3    0
2    2
1    4
Name: 0, dtype: int32
>>> frame.icol(1)
3    1
2    3
1    5
Name: 1, dtype: int32
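
irow and icol select by integer position, not by label. As far as I know, newer pandas releases no longer provide these two methods, and .iloc covers the same positional lookups. A minimal sketch under that assumption, using the same frame:

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(np.arange(6).reshape(3, 2), index=[3, 2, 1])

first_row = frame.iloc[0]        # row at position 0 (its label is 3)
second_col = frame.iloc[:, 1]    # column at position 1

print(first_row.tolist())    # [0, 1]
print(second_col.tolist())   # [1, 3, 5]

# .loc looks up by label instead, so label 3 is the same row as position 0
print(frame.loc[3].equals(first_row))   # True
```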

141207

Pandas

Correlation and covariance

>>> from pandas import DataFrame
>>> import pandas.io.data as web
>>> all_data = {}
>>> for ticker in ['AAPL', 'IBM', 'MSFT', 'GOOG']:
...     try:
...         all_data[ticker] = web.get_data_yahoo(ticker, '1/1/2005', '1/1/2010')
...         price = DataFrame({tic: data['Adj Close'] for tic, data in all_data.iteritems()})
...         volume = DataFrame({tic: data['Volume'] for tic, data in all_data.iteritems()})
...     except:
...         print "Cant find ", ticker
...
Cant find GOOG
>>> returns = price.pct_change()
>>> returns.tail()
                AAPL       IBM      MSFT
Date
2009-12-24  0.034382  0.004402  0.002582
2009-12-28  0.012376  0.013315  0.005519
2009-12-29 -0.011876 -0.003410  0.006952
2009-12-30  0.012018  0.005424 -0.013808
2009-12-31 -0.004191 -0.012616 -0.015475
>>> returns.corr()
          AAPL       IBM      MSFT
AAPL  1.000000  0.497221  0.445868
IBM   0.497221  1.000000  0.559678
MSFT  0.445868  0.559678  1.000000
>>> returns.cov()
          AAPL       IBM      MSFT
AAPL  0.000716  0.000206  0.000236
IBM   0.000206  0.000240  0.000171
MSFT  0.000236  0.000171  0.000390
>>> returns.corrwith(volume)
AAPL   -0.010122
IBM     0.031356
MSFT   -0.056822
dtype: float64
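
To see how pct_change, cov and corr relate to each other, here is a small sketch that needs no download. The prices are made-up numbers for illustration only, not market data, and the column names A and B are placeholders.

```python
import numpy as np
import pandas as pd

# Made-up prices for illustration only (not market data)
price = pd.DataFrame({
    'A': [100.0, 102.0, 101.0, 105.0, 107.0],
    'B': [50.0, 50.5, 49.0, 51.0, 52.5],
})

returns = price.pct_change()

# pct_change is the ratio to the previous row minus one
manual_returns = price / price.shift(1) - 1
same_returns = np.allclose(returns.dropna(), manual_returns.dropna())

# Correlation is the covariance scaled by both standard deviations
std = returns.std()
manual_corr = returns.cov() / np.outer(std, std)
same_corr = np.allclose(returns.corr(), manual_corr)

print(same_returns, same_corr)   # True True
```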

Saturday, December 6, 2014

141206(2)

Pandas

How to use sum and mean

>>> import numpy as np
>>> from pandas import DataFrame
>>> a = np.arange(24).reshape((6,4))
>>> a
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19],
       [20, 21, 22, 23]])
>>> df = DataFrame(a, index=['a','b','c','d','e','f'], columns=['one','two','three','four'])
>>> df
   one  two  three  four
a    0    1      2     3
b    4    5      6     7
c    8    9     10    11
d   12   13     14    15
e   16   17     18    19
f   20   21     22    23
>>> df.sum()
one      60
two      66
three    72
four     78
dtype: int64
>>> df.mean()
one      10
two      11
three    12
four     13
dtype: float64
>>> df.sum(axis=1)
a     6
b    22
c    38
d    54
e    70
f    86
dtype: int64
>>> df.mean(axis=1)
a     1.5
b     5.5
c     9.5
d    13.5
e    17.5
f    21.5
dtype: float64

141206

Pandas

Slicing with labels

>>> import numpy as np
>>> from pandas import Series
>>> obj = Series(np.arange(4), index=['a', 'b', 'c', 'd'])
>>> obj[1:2]
b    1
dtype: int32
>>> obj['b':'c']
b    1
c    2
dtype: int32
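
The point of the two slices is that a positional slice excludes its end point, while a label slice includes it. A minimal sketch that spells this out with .iloc and .loc, which make the choice between position and label explicit:

```python
import numpy as np
import pandas as pd

obj = pd.Series(np.arange(4), index=['a', 'b', 'c', 'd'])

by_position = obj.iloc[1:2]    # positions 1 up to, but not including, 2
by_label = obj.loc['b':'c']    # labels 'b' through 'c', both included

print(by_position.index.tolist())   # ['b']
print(by_label.index.tolist())      # ['b', 'c']
```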