# neural network stuff

not unlike [my DSP repo,](https://github.com/notwa/dsp)
`onn` is a bunch of half-baked python code that's kinda handy.

i give no guarantee anything provided here is correct.
don't expect commits, docs, or comments to be particularly verbose.
however, i do attempt to cite and source any techniques used.

## alternatives

when creating this, i wanted a library free of compilation steps
and heavy dependencies (other than numpy and scipy, which are commonplace).
although `onn` is significantly faster than equivalent autograd code,
performance is not a priority, and it cannot run on a GPU.
since this is my personal repo, i recommend that others do not rely on it.
instead, consider one of the following:

* [keras](https://github.com/fchollet/keras)
  is now integrated directly into [tensorflow.](https://tensorflow.org)
  it runs on CPU and GPU; however, it requires a compilation stage.
  * also check out the
    [keras-contrib](https://github.com/farizrahman4u/keras-contrib)
    library for more keras components based on recent papers.
  * the library itself may be discontinued, but
    [theano's source code](https://github.com/Theano/theano/blob/master/theano/tensor/nnet/nnet.py)
    contains pure numpy test methods as reference.
* [minpy](https://github.com/dmlc/minpy)
  for tensor-powered numpy routines and automatic differentiation.
  it has since been deprecated in favor of [mxnet.](https://github.com/apache/incubator-mxnet)
  i've never used either, so i don't know what mxnet is like.
* [autograd](https://github.com/HIPS/autograd)
  for automatic differentiation without tensors.
  this is my personal favorite, although it is a little slow.
  * autograd has been discontinued in favor of
    [Google's JAX.](https://github.com/google/jax)
    however, JAX is quite heavy and non-portable in comparison.
    JAX runs on CPU and GPU, and it can skip compilation on CPU.
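
to give a taste of what those pure-numpy reference methods look like,
here's a quick sketch of a numerically-stable softmax
(my own illustration, not code taken from theano):

```python
import numpy as np

def softmax(x):
    # subtract the per-row max before exponentiating;
    # this avoids overflow without changing the result.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

probs = softmax(np.array([1.0, 2.0, 3.0]))
print(probs)  # three positive values summing to 1, largest last
```

reference implementations like this are handy for checking
a faster (or GPU-bound) implementation against known-good output.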

## dependencies

python 3.5+

mandatory packages: `numpy` `scipy`

needed for saving weights: `h5py`

used in example code: `dotmap`
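
if you want to check which of these are present without importing them,
something like this works (just an illustration; it isn't part of `onn`):

```python
import importlib.util

# map each package to why onn wants it
deps = {"numpy": "mandatory", "scipy": "mandatory",
        "h5py": "saving weights", "dotmap": "example code"}
for name, why in deps.items():
    ok = importlib.util.find_spec(name) is not None
    print("{:6} ({}): {}".format(name, why, "found" if ok else "missing"))
```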

## minimal example

```python
#!/usr/bin/env python3

import numpy as np
import mnists  # https://github.com/notwa/mnists

from onn import *

train_x, train_y, valid_x, valid_y = mnists.prepare("mnist")

learning_rate = 0.01
epochs = 20
batch_size = 500
hidden_size = 196  # 1/4 the number of pixels in an mnist sample
reg = L1L2(1e-5, 1e-4)
final_reg = None

x = Input(shape=train_x.shape[1:])  # give the shape of a single example
y = x  # superficial code just to make changing layer order a little easier

y = y.feed(Flatten())
y = y.feed(Dense(hidden_size, init=init_he_normal, reg_w=reg, reg_b=reg))
y = y.feed(Dropout(0.5))
y = y.feed(GeluApprox())
y = y.feed(Dense(10, init=init_glorot_uniform, reg_w=final_reg, reg_b=final_reg))
y = y.feed(Softmax())

model = Model(x, y,  # follow the graph from node x to y
              loss=CategoricalCrossentropy(), mloss=Accuracy(),
              unsafe=True)  # skip some sanity checks to go faster

optim = Adam()  # good ol' adam
learner = WaveCLR(optim, upper_rate=learning_rate,
                  epochs=epochs, period=epochs)  # ramp the rate up and down
ritual = Ritual(learner=learner)  # the accursed deep-learning ritual
ritual.prepare(model)  # reset training

while learner.next():
    print("epoch", learner.epoch)
    losses = ritual.train(*batchize(train_x, train_y, batch_size))
    print("train accuracy", "{:6.2%}".format(losses.avg_mloss))

def print_error(name, train_x, train_y):
    losses = ritual.test_batched(train_x, train_y, batch_size)
    print(name + " loss", "{:12.6e}".format(losses.avg_loss))
    print(name + " accuracy", "{:6.2%}".format(losses.avg_mloss))

print_error("train", train_x, train_y)
print_error("valid", valid_x, valid_y)

predicted = model.evaluate(train_x)  # use this as you will!
```
[(mnists is available here)](https://github.com/notwa/mnists)

## contributing
i'm just throwing this code out there,
so i don't actually expect anyone to contribute,
*but* if you do find a blatant issue,
maybe [yell at me on twitter.](https://twitter.com/antiformant)