Sunday, December 28, 2014

141228

Haskell


union

Prelude> :m + Data.List
Prelude Data.List> :t union
union :: Eq a => [a] -> [a] -> [a]
Prelude Data.List> union [1,2,2,3,3,3,4,4,4,4] [3,4,4,5,6,6]
[1,2,2,3,3,3,4,4,4,4,5,6]
Prelude Data.List> union [3,4,4,5,6,6] [1,2,2,3,3,3,4,4,4,4]
[3,4,4,5,6,6,1,2]

foldl1

Prelude> foldl1 (+) [1..10]
55
Prelude> foldl1 (*) [1..10]
3628800
Prelude> foldl1 (lcm) [1..10]
2520
Prelude> foldl1 (gcd) [8,12,36]
4

Wednesday, December 24, 2014

141224

Haskell


List difference

\\ is a function provided by Data.List, not by PreludeList.

Prelude> import Data.List
Prelude Data.List> [x | x <- [220,230..320]] \\ [x+y | x <- [0,50..300], y <- [0,80..320]]
[220,270]

Sunday, December 7, 2014

141207(4)

NumPy


Splitting

>>> import numpy as np
>>> ary = np.arange(20).reshape(5, 4)
>>> first, second, third, fourth = np.split(ary, [1,2,3])
>>> first
array([[0, 1, 2, 3]])
>>> fourth
array([[12, 13, 14, 15],
       [16, 17, 18, 19]])
>>> first, second, third = np.split(ary, [1,2], axis=1)
>>> first
array([[ 0],
       [ 4],
       [ 8],
       [12],
       [16]])
>>> third
array([[ 2,  3],
       [ 6,  7],
       [10, 11],
       [14, 15],
       [18, 19]])
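
The list passed to np.split gives the positions where the cuts fall, not the sizes of the pieces. The following is a small sketch of that reading, using the same array as above; the names parts, left, right, and restored are only for illustration.

```python
import numpy as np

ary = np.arange(20).reshape(5, 4)

# Indices [1, 2, 3] mark the row positions where cuts are made,
# so the pieces are rows [0:1], [1:2], [2:3], and [3:].
parts = np.split(ary, [1, 2, 3])
shapes = [p.shape for p in parts]
print(shapes)  # [(1, 4), (1, 4), (1, 4), (2, 4)]

# An integer asks for that many equal pieces along the given axis.
left, right = np.split(ary, 2, axis=1)
print(left.shape, right.shape)  # (5, 2) (5, 2)

# Concatenating the pieces restores the original array.
restored = np.concatenate(parts)
print(np.array_equal(restored, ary))  # True
```

With an integer argument, np.split raises an error when the axis cannot be divided evenly; np.array_split is the variant that tolerates uneven pieces.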

141207(3)

Pandas


Panel

>>> import pandas as pd
>>> import pandas.io.data as web
>>> pdata = pd.Panel(dict((stk, web.get_data_yahoo(stk, '12/1/2014', '12/5/2014')) for stk in ['AAPL', 'IBM', 'MSFT']))
>>> pdata
<class 'pandas.core.panel.Panel'>
Dimensions: 3 (items) x 5 (major_axis) x 6 (minor_axis)
Items axis: AAPL to MSFT
Major_axis axis: 2014-12-01 00:00:00 to 2014-12-05 00:00:00
Minor_axis axis: Open to Adj Close
>>> pdata = pdata.swapaxes('items','minor')
>>> pdata['Adj Close']
              AAPL     IBM   MSFT
Date
2014-12-01  115.07  161.54  48.62
2014-12-02  114.63  162.67  48.46
2014-12-03  115.93  164.52  48.08
2014-12-04  115.49  164.05  48.84
2014-12-05  115.00  163.27  48.42
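
Later pandas releases removed Panel, so the session above only runs on an old installation. One possible substitute, sketched here under assumptions, is a DataFrame whose columns form a two-level MultiIndex. The tickers AAA and BBB and all the numbers are made up for illustration and are not downloaded quotes.

```python
import pandas as pd

# Hypothetical data: made-up numbers standing in for downloaded quotes.
dates = pd.date_range("2014-12-01", periods=3, freq="D")
frames = {
    "AAA": pd.DataFrame({"Open": [1.0, 2.0, 3.0],
                         "Close": [1.5, 2.5, 3.5]}, index=dates),
    "BBB": pd.DataFrame({"Open": [10.0, 20.0, 30.0],
                         "Close": [15.0, 25.0, 35.0]}, index=dates),
}

# Columns become a two-level MultiIndex: (ticker, field).
wide = pd.concat(frames, axis=1)

# Pick one field across all tickers, much like pdata['Adj Close']
# after swapping the items and minor axes.
close = wide.xs("Close", axis=1, level=1)
print(close)
```

The result has one column per ticker and the dates as its index, which is the same layout as the Adj Close table above.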

141207(2)

Pandas


Indexing by position

>>> import numpy as np
>>> from pandas import Series, DataFrame
>>> frame = DataFrame(np.arange(6).reshape(3,2), index=[3,2,1])
>>> frame
   0  1
3  0  1
2  2  3
1  4  5
>>> frame.irow(0)
0    0
1    1
Name: 3, dtype: int32
>>> frame.irow(1)
0    2
1    3
Name: 2, dtype: int32
>>> frame.irow(2)
0    4
1    5
Name: 1, dtype: int32
>>> frame.icol(0)
3    0
2    2
1    4
Name: 0, dtype: int32
>>> frame.icol(1)
3    1
2    3
1    5
Name: 1, dtype: int32
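
irow and icol were removed from later pandas releases in favour of iloc. A minimal sketch of the same lookups with iloc follows, with loc shown for contrast; the variable names are only for illustration.

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(np.arange(6).reshape(3, 2), index=[3, 2, 1])

first_row = frame.iloc[0]      # row at position 0 (its label is 3)
second_col = frame.iloc[:, 1]  # column at position 1

# .loc looks up by label instead, so 3 here means the row labelled 3.
same_row = frame.loc[3]

print(first_row.tolist())          # [0, 1]
print(second_col.tolist())         # [1, 3, 5]
print(first_row.equals(same_row))  # True
```

Because the index here is made of integers in reverse order, position 0 and label 3 refer to the same row, which is the case where keeping iloc and loc apart matters most.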

141207

Pandas


Correlation and covariance

>>> from pandas import DataFrame
>>> import pandas.io.data as web
>>> all_data = {}
>>> for ticker in ['AAPL', 'IBM', 'MSFT', 'GOOG']:
...     try:
...         all_data[ticker] = web.get_data_yahoo(ticker, '1/1/2005', '1/1/2010')
...         price = DataFrame({tic: data['Adj Close'] for tic, data in all_data.iteritems()})
...         volume = DataFrame({tic: data['Volume'] for tic, data in all_data.iteritems()})
...     except:
...         print "Cant find ", ticker
...
Cant find  GOOG
>>> returns = price.pct_change()
>>> returns.tail()
                AAPL       IBM      MSFT
Date
2009-12-24  0.034382  0.004402  0.002582
2009-12-28  0.012376  0.013315  0.005519
2009-12-29 -0.011876 -0.003410  0.006952
2009-12-30  0.012018  0.005424 -0.013808
2009-12-31 -0.004191 -0.012616 -0.015475
>>> returns.corr()
          AAPL       IBM      MSFT
AAPL  1.000000  0.497221  0.445868
IBM   0.497221  1.000000  0.559678
MSFT  0.445868  0.559678  1.000000
>>> returns.cov()
          AAPL       IBM      MSFT
AAPL  0.000716  0.000206  0.000236
IBM   0.000206  0.000240  0.000171
MSFT  0.000236  0.000171  0.000390
>>> returns.corrwith(volume)
AAPL   -0.010122
IBM     0.031356
MSFT   -0.056822
dtype: float64
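
The corr and cov tables are related: each correlation is the covariance divided by the two standard deviations. Here is a small check of that relationship, assuming made-up prices for two hypothetical tickers AAA and BBB rather than the downloaded data.

```python
import numpy as np
import pandas as pd

# Hypothetical prices, made up for illustration only.
price = pd.DataFrame({
    "AAA": [100.0, 102.0, 101.0, 105.0, 107.0],
    "BBB": [50.0, 50.5, 49.0, 51.0, 50.0],
})
returns = price.pct_change().dropna()

cov = returns.cov()
corr = returns.corr()

# Correlation is covariance scaled by both standard deviations.
std = returns.std()
manual = cov.loc["AAA", "BBB"] / (std["AAA"] * std["BBB"])
print(np.isclose(manual, corr.loc["AAA", "BBB"]))  # True
```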

Saturday, December 6, 2014

141206(2)

Pandas


Using sum and mean

>>> a = np.arange(24).reshape((6,4))
>>> a
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19],
       [20, 21, 22, 23]])
>>> df = DataFrame(a, index=['a','b','c','d','e','f'], columns=['one','two','three','four'])
>>> df
   one  two  three  four
a    0    1      2     3
b    4    5      6     7
c    8    9     10    11
d   12   13     14    15
e   16   17     18    19
f   20   21     22    23
>>> df.sum()
one      60
two      66
three    72
four     78
dtype: int64
>>> df.mean()
one      10
two      11
three    12
four     13
dtype: float64
>>> df.sum(axis=1)
a     6
b    22
c    38
d    54
e    70
f    86
dtype: int64
>>> df.mean(axis=1)
a     1.5
b     5.5
c     9.5
d    13.5
e    17.5
f    21.5
dtype: float64
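
The axis argument names the direction that gets collapsed: the default reduces down the rows and leaves one value per column, while axis=1 reduces across the columns and leaves one value per row. The sketch below repeats the frame above with the string spellings of axis and also shows the default handling of missing values; df_nan is a name introduced only for this example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(24).reshape(6, 4),
                  index=list("abcdef"),
                  columns=["one", "two", "three", "four"])

col_totals = df.sum(axis="index")    # same as df.sum(): one value per column
row_totals = df.sum(axis="columns")  # same as df.sum(axis=1): one value per row

# NaN values are skipped by default; skipna=False propagates them.
df_nan = df.astype(float)
df_nan.loc["a", "one"] = np.nan
print(df_nan["one"].mean())              # 12.0
print(df_nan["one"].mean(skipna=False))  # nan
```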

141206

Pandas


Slicing with labels

>>> obj = Series(np.arange(4), index = ['a', 'b', 'c', 'd'])
>>> obj[1:2]
b    1
dtype: int32
>>> obj['b':'c']
b    1
c    2
dtype: int32